Mon. Not. R. Astron. Soc. 000, 1-20 (2011) Printed August 15, 2011 (MN style file v2.2) 



Only marginal alignment of disc galaxies 



René Andrae^* and Knud Jahnke^



^ Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany



Received 2011 June 10. Accepted 2011 August 11.



ABSTRACT 

Testing theories of angular-momentum acquisition of
rotationally supported disc galaxies is the key to
understanding the formation of this type of galaxy. The
tidal-torque theory tries to explain this acquisition process
in a cosmological framework and predicts positive
autocorrelations of angular-momentum orientation and
spiral-arm handedness, i.e., alignment of disc galaxies, on
short distance scales of 1 Mpc/h. This disc alignment can also
cause systematic effects in weak-lensing measurements.
Previous observations claimed to have discovered these
correlations but are overly optimistic in the reported level
of statistical significance of the detections. Errors in
redshift, ellipticity and morphological classifications were
not taken into account, although they have a significant
impact. We explain how to rigorously propagate all important
errors through the estimation process. Analysing disc galaxies
in the SDSS database, we find that positive autocorrelations
of spiral-arm handedness and angular-momentum orientations on
distance scales of 1 Mpc/h are plausible but not statistically
significant. Current data appear not good enough to constrain
parameters of theory. This result agrees with a simple
hypothesis test in the Local Group, where we also find no
evidence for disc alignment. Moreover, we demonstrate that
ellipticity estimates based on second moments are strongly
biased by galactic bulges even for Scd galaxies, thereby
corrupting correlation estimates and overestimating the impact
of disc alignment on weak-lensing studies. Finally, we discuss
the potential of future sky surveys. We argue that photometric
redshifts have too large errors, i.e., PanSTARRS and LSST
cannot be used. Conversely, the EUCLID project will not cover
the relevant redshift regime. We also discuss potentials and
problems of front-edge classifications of galaxy discs in
order to improve autocorrelation estimates of angular-momentum
orientation.

Key words: Galaxies: general - Methods: data analysis, statistical. 



1 INTRODUCTION 

Disc galaxies constitute a substantial part of the galaxy
population in the nearby universe (Bamford et al. 2009). As
these galaxies are rotationally supported, it is of vital
importance to understand how disc galaxies acquire their
angular momentum. The tidal-torque theory tries to explain
this angular-momentum acquisition through tidal shearing
from the dark-matter host halo's gravitational field and the
moment of inertia of the forming protogalaxy (for a recent
review see Schäfer 2009). This theory predicts alignment
effects of disc galaxies, since angular-momentum acquisition
is partially governed by environmental effects such that
neighbouring disc galaxies residing in the same environment
should exhibit similar angular momenta. Hence, testing
intrinsic alignments of angular momenta of disc galaxies
provides a fundamental test for our understanding of galaxy
formation in the cosmological framework. Apart from enhancing
our understanding of disc-galaxy formation, investigating
these alignment effects is also important because they
constitute a potentially significant systematic effect in
weak-gravitational-lensing surveys (e.g. Crittenden et al.
2001).

For this goal, we use autocorrelation estimates of
spiral-arm handedness and galactic
angular-momentum-orientation vectors, respectively. We
revisit the works by Slosar et al. (2009) and Lee (2011) and
explain that these estimations do not take into account all
relevant error contributions and are therefore too optimistic
in the reported statistical significance. In this article, we
explain how to incorporate the relevant error sources and
demonstrate their impact on the results. This methodological
rigour is also highly relevant in a general sense, since at
the frontier of astrophysical research data analysis can
otherwise produce misleading results. As is typical for
methodological studies, the basic principle and the
techniques presented here are also applicable to other
astrophysical investigations which involve the estimation of
spatial two-point correlation functions, for instance,
investigations of baryonic acoustic oscillations (BAOs)
(e.g. Blake et al. 2011). Although estimators used for
investigations of BAOs are usually much more elaborate than
the simple estimator we are going to use, our assessment of
the impact of (star-galaxy) classification and redshift
errors also applies to this setting.^1 However, we also go
beyond the purely methodological aspect and discuss the
potential of improving the autocorrelation estimates with new
surveys in order to eventually obtain astrophysical results.
In particular, we discuss the potential of estimating the
front edges of disc galaxies via dust extinction in order to
improve correlation estimates of angular-momentum orientation.

1.1 Strategy 

We start in Sect. 2 by investigating the orientations of
angular-momentum vectors in the Local Group. This is
meant as an exercise, motivating the necessity of correlation
functions. We then present in Sect. 3 the details and
selection criteria of the data samples we are using. In
Sect. 4, we explain how to obtain correlation estimates and
their corresponding error estimates for both handedness and
angular-momentum-orientation vectors. The main body of this
article is Sect. 5, where we explain the difference between
conditional and marginal errors, discuss the relevant error
contributions, and explain how to propagate errors numerically
by simple Monte-Carlo sampling. In that section, we estimate
marginal autocorrelations of handedness and angular-momentum
orientations, respectively. This is also the section relevant
to readers who are interested in marginal estimates of
correlation functions in general, e.g., in the context of
baryonic acoustic oscillations. In an attempt to improve the
statistical significance of our results by replacing
isophotal ellipticity estimates by less noisy estimators, we
show in Sect. 6 that ellipticities based on second moments
are strongly biased. We clearly demonstrate that this bias
corrupts correlation estimates. We outline possible
improvements and the potential of future sky surveys in
Sect. 7. We discuss our final results and conclude in Sect. 8.



2 ARE ANGULAR MOMENTA RANDOMLY 
ORIENTED IN THE LOCAL GROUP? 

As we are investigating the alignment of angular momenta
of disc galaxies, the Local Group is a natural first testbed.
Apart from numerous dwarf galaxies, the Local Group consists
of four disc galaxies, namely the Milky Way, Andromeda (M31),
M33, and the Large Magellanic Cloud (LMC), all with pairwise
distances of less than 1 Mpc.

2.1 Angular-momentum orientation of the Milky Way

We start by estimating the angular-momentum-orientation
vector of the Milky Way in equatorial coordinates. In order
to estimate the angular-momentum-orientation vector of the
Milky Way, we need two ingredients:

(i) The unit vector $\hat{r}_\odot$ pointing from the Galactic
centre to the position of the Sun.

(ii) The unit vector $\hat{v}_\odot$ of the Sun's velocity on
its trajectory around the Galactic centre.

^1 The perfect BAO estimator would be a generative model that
for every galaxy predicts the redshift and star-galaxy
classification probability based on the observation conditions.
This would enable us to directly take into account these error
sources.

Given the valid assumption that the Sun lies inside and is 
co-rotating with the Galactic disc, we can then compute the 
Milky Way's angular-momentum-orientation vector via 

$$\hat{L}_{\rm MW} = \hat{r}_\odot \times \hat{v}_\odot . \tag{1}$$

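We can infer $\hat{r}_\odot$ from the equatorial coordinates
of the Galactic centre, $\alpha_{\rm GC} \approx 266.42^\circ$
and $\delta_{\rm GC} \approx -29.01^\circ$. Here, we have to
keep in mind that these coordinates are pointing from the Sun
towards the Galactic centre, i.e., $\hat{r}_\odot$ is the
inverted direction,

$$\hat{r}_\odot = -\begin{pmatrix} \cos\alpha_{\rm GC}\,\sin(90^\circ - \delta_{\rm GC}) \\ \sin\alpha_{\rm GC}\,\sin(90^\circ - \delta_{\rm GC}) \\ \cos(90^\circ - \delta_{\rm GC}) \end{pmatrix} . \tag{2}$$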

The unit vector $\hat{v}_\odot$ has to be inferred from the
rotation of the Galactic disc. By definition, $\hat{v}_\odot$
points into the direction specified by Galactic longitude
$\ell = 90^\circ$ and Galactic latitude $b = 0^\circ$ (e.g.
Brunthaler et al. 2005). In equatorial coordinates, this
direction is given by $\alpha_v \approx 318.00^\circ$ and
$\delta_v \approx 48.33^\circ$, such that

$$\hat{v}_\odot = \begin{pmatrix} \cos\alpha_v\,\sin(90^\circ - \delta_v) \\ \sin\alpha_v\,\sin(90^\circ - \delta_v) \\ \cos(90^\circ - \delta_v) \end{pmatrix} . \tag{3}$$

Inserting these values into Eq. (1), we obtain the following
estimate of the angular-momentum-orientation vector of the
Milky Way,

$$\hat{L}_{\rm MW} = \begin{pmatrix} 0.86771 \\ 0.19878 \\ -0.45560 \end{pmatrix} . \tag{4}$$

We conduct two cross-checks: First, the two unit vectors
$\hat{r}_\odot$ and $\hat{v}_\odot$ should be orthogonal and
indeed their scalar product is
$\hat{r}_\odot \cdot \hat{v}_\odot \approx 0.00097 \ll 1$.
Second, $\hat{L}_{\rm MW}$ by construction should be normal to
the plane of the Galactic disc, i.e., parallel or antiparallel
to the unit vector $\hat{n}_{\rm NP}$ pointing into the
direction of the Galactic North pole, whose equatorial
coordinates are given by $\alpha_{\rm NP} \approx 192.86^\circ$
and $\delta_{\rm NP} \approx 27.13^\circ$. Indeed, the scalar
product is $\hat{L}_{\rm MW} \cdot \hat{n}_{\rm NP} \approx -0.9999992$,
i.e., both vectors are almost perfectly antiparallel.
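The computation of Sect. 2.1 can be retraced numerically. The following is a minimal sketch using only the coordinates quoted above; the helper names (`unit_vector`, `cross`, `dot`) are our own illustrative choices and not part of any pipeline used in this article.

```python
import math

def unit_vector(alpha_deg, delta_deg):
    """Cartesian unit vector for right ascension alpha and declination delta (degrees)."""
    a, d = math.radians(alpha_deg), math.radians(delta_deg)
    # sin(90 deg - delta) = cos(delta) and cos(90 deg - delta) = sin(delta)
    return (math.cos(a) * math.cos(d), math.sin(a) * math.cos(d), math.sin(d))

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

# Direction Sun -> Galactic centre; the Sun's position vector is the inverse (Eq. 2).
to_gc = unit_vector(266.42, -29.01)
r_sun = tuple(-c for c in to_gc)

# Direction of Galactic rotation at the Sun (l = 90 deg, b = 0 deg), Eq. (3).
v_sun = unit_vector(318.00, 48.33)

# Angular-momentum orientation of the Milky Way, Eq. (1).
L_mw = cross(r_sun, v_sun)

# Cross-checks: orthogonality and alignment with the Galactic North pole.
n_np = unit_vector(192.86, 27.13)
print("L_MW        =", L_mw)
print("r_sun.v_sun =", dot(r_sun, v_sun))
print("L_MW.n_NP   =", dot(L_mw, n_np))
```

Since the input angles are rounded to two decimals, the printed values should only be expected to agree with Eq. (4) and the quoted cross-checks to about three decimal places.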

2.2 Angular-momentum orientations of 
Andromeda, M33, and the LMC 

In order to estimate the angular-momentum orientations of 
Andromeda, M33, and the LMC, we use the formalism de- 
scribed in Lee (2011) which is based on ellipticity estimates 
and the assumption of intrinsically circular galactic discs. 

2.2.1 Andromeda (M31) 

For Andromeda, we adopt an inclination angle of 77°
(Walterbos & Kennicutt 1988) and an orientation angle of 38°
(Walterbos & Kennicutt 1987). Furthermore, dust lanes enable
us to identify the front edge of Andromeda's galactic disc,
which is the North-Western edge. Our front-edge estimate
agrees with the result of Iye & Ozawa (1999) who investigated
the reddening of globular clusters as a function of height
above the major axis. Given its equatorial coordinates
$\alpha_{\rm M31} \approx 10.69^\circ$ and
$\delta_{\rm M31} \approx 41.27^\circ$, we can compute the
angular-momentum-orientation vector of Andromeda up to its
sign. Chemin et al. (2009) published spatially resolved HI
spectra of M31, which enable us to infer the disc rotation
directly. Their map of radial velocities directly implies
that the North-Eastern part is receding from us, whereas the
South-Western part is rotating towards us. Consequently, the
angular-momentum-orientation vector of Andromeda points
South-East and away from our own position. Therefore, if we
project $\hat{L}_{\rm M31}$ onto the unit direction vector
pointing from the Milky Way towards Andromeda, this projection
must be positive. This condition enables us to fully determine
the angular-momentum-orientation vector of Andromeda,

$$\hat{L}_{\rm M31} = \begin{pmatrix} -0.08031 \\ -0.79651 \\ 0.59926 \end{pmatrix} . \tag{5}$$



2.2.2 Triangulum Galaxy (M33) 

Concerning M33, we adopt an inclination angle of 49° and
an orientation angle of 21° (Corbelli & Schneider 1997).
M33 clearly is a right-handed (Z-wise) spiral. This rotational
sense agrees with the results of Brunthaler et al. (2005) who
observed the proper motion of two H2O masers in M33. It
also agrees with the results of Putman et al. (2009), who
measured the radial-velocity field of HI gas in M33. Again,
this implies that the projections of both possible front-edge
configurations of $\hat{L}_{\rm M33}$ onto the unit direction
vector pointing from the Milky Way towards M33 have to be
positive. Unfortunately, M33 does not exhibit dust lanes, such
that the front edge remains unknown. This is not surprising
since M33 is not as highly inclined as Andromeda such that we
are less likely to observe a dust lane. From dust reddening
of C-rich AGB stars Cioni et al. (2008) concluded that there
is weak evidence that the North-Western side of M33 is the
front edge. Given its equatorial coordinates
$\alpha_{\rm M33} \approx 23.46^\circ$ and
$\delta_{\rm M33} \approx 30.66^\circ$, the
angular-momentum-orientation vector of M33 then reads

$$\hat{L}_{\rm M33} = \begin{pmatrix} 0.67170 \\ -0.47655 \\ 0.56721 \end{pmatrix} . \tag{6}$$



The front-edge estimate of Cioni et al. (2008) is still rather 
uncertain (see their Fig. 9). However, it is sufficient for this 
exercise. 



2.2.3 Large Magellanic Cloud (LMC) 

Concerning the LMC, we adopt an inclination angle of 35°
and an orientation angle of 123° (van der Marel & Cioni
2001). Furthermore, van der Marel & Cioni (2001) find clear
evidence that the North-Eastern side of the disc is the front
edge (their Fig. 5). The rotational sense of the LMC is
right-handed as is evident from observed velocity fields
(e.g. Olsen & Massey 2007). Given its equatorial coordinates
$\alpha_{\rm LMC} \approx 80.8938^\circ$ and
$\delta_{\rm LMC} \approx -69.7561^\circ$, the
angular-momentum-orientation vector of the LMC then reads

$$\hat{L}_{\rm LMC} = \begin{pmatrix} -0.29699 \\ -0.46945 \\ -0.83152 \end{pmatrix} . \tag{7}$$

Figure 1. KS-test of angular-momentum-orientation vectors in
the Local Group. Step function: Empirical (unbiased) estimate of
cumulative distribution for the Local Group. Dashed line:
Cumulative distribution of null hypothesis of random
orientations. The maximum KS-distance is
$D_{\rm max} \approx 0.385$.



2.3 Random orientation 

Are the angular-momentum-orientation vectors in the Local
Group compatible with the null hypothesis of random
orientation? In order to test this, we investigate the
distribution of projection values. For the four disc galaxies,
we can only derive three statistically independent
projections. We choose the projections onto the Milky Way:

• $\hat{L}_{\rm MW} \cdot \hat{L}_{\rm M31} \approx -0.5010$

• $\hat{L}_{\rm MW} \cdot \hat{L}_{\rm M33} \approx +0.2297$

• $\hat{L}_{\rm MW} \cdot \hat{L}_{\rm LMC} \approx +0.0278$

Adding further projection values, e.g.,
$\hat{L}_{\rm M31} \cdot \hat{L}_{\rm M33}$, would introduce
correlations compromising the KS-test. Figure 1 shows the
resulting cumulative distribution of projection values for the
Local Group. Furthermore, Fig. 1 shows the cumulative
distribution for the null hypothesis where all projection
values are equally likely. The KS-distance is then
$D_{\rm max} \approx 0.385$ which yields a p-value of
$\approx 0.648$ (Press et al. 2002). Consequently, if the
angular-momentum-orientation vectors in the Local Group were
randomly oriented, a KS-distance at least this large would
occur with 64.8% probability, i.e., we cannot reject the null
hypothesis.

We conclude from this simple hypothesis test that there
is no evidence that disc alignment is at work in the Local
Group. However, this hypothesis test is rather coarse
given the small number of disc galaxies and the neglect
of galaxy separations. Hence, a more elaborate investigation
naturally leads us to spatial autocorrelation functions
estimated from large samples of disc galaxies as the key
diagnostic tool for investigations of disc alignment.
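The hypothesis test of Sect. 2.3 can be sketched in a few lines. We assume here that the null hypothesis corresponds to projection values distributed uniformly on [-1, 1], and that the p-value is obtained from the asymptotic Kolmogorov-Smirnov series with the small-sample correction factor given in Press et al. (2002); both function names are hypothetical.

```python
import math

def ks_distance_uniform(values, lo=-1.0, hi=1.0):
    """Maximum distance between the empirical CDF and a uniform CDF on [lo, hi]."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = (x - lo) / (hi - lo)
        # Compare the model CDF with the step function just after and just before x.
        d = max(d, abs(i / n - cdf), abs(cdf - (i - 1) / n))
    return d

def ks_p_value(d, n, terms=100):
    """Asymptotic KS significance with the small-n correction of Press et al."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))

# Projections of M31, M33 and the LMC onto the Milky Way (Sect. 2.3).
projections = [-0.5010, 0.2297, 0.0278]
d_max = ks_distance_uniform(projections)
p_value = ks_p_value(d_max, len(projections))
print(f"D_max = {d_max:.3f}, p = {p_value:.3f}")
```

With only three projection values the asymptotic series is a rough approximation, which is one more reason to treat this test as an exercise rather than a measurement.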



3 THE DATA 

An autocorrelation analysis of angular momenta requires a
survey covering a large area with homogeneous galaxy
morphologies in order to (a) select disc galaxies and (b)
estimate their three-dimensional angular-momentum-orientation
vectors. The best database for this purpose is the SDSS. We
exploit visual morphological classifications from the Galaxy
Zoo project and automated classifications from
Huertas-Company et al. (2011), enhanced by additional
information from the general SDSS database.



3.1 Galaxy Zoo 

Galaxy Zoo (Lintott et al. 2008; Bamford et al. 2009;
Lintott et al. 2011) is a unique project where the
morphologies of nearly 900,000 galaxies from the Sloan Digital
Sky Survey (SDSS) spectroscopic sample have been classified
visually by the internet community. Each galaxy has been
classified multiple times by different internet users, which
provides a probabilistic object-to-class assignment.
Concerning galaxy morphologies, such a probabilistic
assignment is more physical than a hard assignment, as has
been discussed by Andrae et al. (2010). In detail, the Galaxy
Zoo database provides probabilistic assignments to the
following morphological classes:

• elliptical, $p_{\rm ell}^{\rm GZ}$,

• disc, $p_{\rm disc}^{\rm GZ}$,

• edge-on disc, $p_{\rm edge}^{\rm GZ}$,

• clock-wise/Z-wise spiral in projection, $p_Z^{\rm GZ}$,

• anti-clock-wise/S-wise spiral in projection, $p_S^{\rm GZ}$,

• merger, $p_{\rm mg}^{\rm GZ}$.

All probabilities that are taken from Galaxy Zoo carry a
"GZ" superscript. The normalisation is given by

$$p_{\rm ell}^{\rm GZ} + p_{\rm disc}^{\rm GZ} + p_{\rm edge}^{\rm GZ} + p_Z^{\rm GZ} + p_S^{\rm GZ} + p_{\rm mg}^{\rm GZ} = 1 . \tag{8}$$



Land et al. (2008) reported a bias in the handedness
classifications, $p_Z^{\rm GZ}$ and $p_S^{\rm GZ}$, where more
spiral galaxies are classified as S-wise than as Z-wise.^2
This bias is corrected in an asymmetric, additive fashion by
Land et al. (2008) and Slosar et al. (2009) in order to
enforce that the proportions of Z-wise and S-wise spirals are
equal with regard to the whole sample. In contrast to this,
we employ a symmetric, additive bias correction of the form

$$p_Z = p_Z^{\rm GZ} + b \quad \text{and} \quad p_S = p_S^{\rm GZ} - b , \tag{9}$$



where b is chosen such that the numbers of Z-wise and S-wise
spirals are identical. There are two reasons:

(i) The symmetric correction preserves the normalisation
of Eq. (8). This is important because in contrast to Slosar
et al. (2009) we are handling the Galaxy Zoo results fully
probabilistically in our analysis (cf. Sect. 5.2).

(ii) Demanding that the proportions of Z-wise and S-wise
spirals are equal only provides a single condition, such that
an asymmetric correction with two biases, $b_Z$ and $b_S$, is
not fully constrained and therefore arbitrary.

^2 Land et al. (2008) also used flipped galaxy images and still
observed an excess of S-wise over Z-wise spirals in visual
classifications. The exact origin of this bias is unknown,
though one option considered by Land et al. (2008) is a
psychological effect.

Our value of b is 0.0105 and thus similar to Land et al.
(2008). Slosar et al. (2009) argued that such a bias can only
lead to a constant offset in the handedness autocorrelation
function, but it cannot feign a distance-dependent
autocorrelation, which is the predicted astrophysical signal.
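To illustrate why the symmetric correction of Eq. (9) preserves the normalisation of Eq. (8), consider the following minimal sketch. It assumes that "identical numbers of Z-wise and S-wise spirals" is read in the probabilistic sense of a vanishing mean handedness; the toy probabilities and the function names are hypothetical and are not taken from the Galaxy Zoo catalogue.

```python
def symmetric_bias(p_z, p_s):
    """Bias b for which the mean of (p_Z + b) - (p_S - b) vanishes."""
    n = len(p_z)
    mean_handedness = sum(z - s for z, s in zip(p_z, p_s)) / n
    return -0.5 * mean_handedness

def apply_correction(p_z, p_s, b):
    """Symmetric additive correction: p_Z gains b, p_S loses b."""
    return [z + b for z in p_z], [s - b for s in p_s]

# Toy sample with an excess of S-wise classifications (illustrative values only).
p_z = [0.78, 0.10, 0.05]
p_s = [0.12, 0.85, 0.90]

b = symmetric_bias(p_z, p_s)
p_z_corr, p_s_corr = apply_correction(p_z, p_s, b)

# The sum p_Z + p_S of every galaxy is unchanged, so Eq. (8) still holds.
for z0, s0, z1, s1 in zip(p_z, p_s, p_z_corr, p_s_corr):
    print(f"{z0 + s0:.3f} -> {z1 + s1:.3f}")
```

An asymmetric correction with two separate offsets would change the sum of the six class probabilities for each galaxy. Note also that, for a large bias, an additive shift can push individual probabilities outside [0, 1]; for b of order 0.01 this concerns only values very close to 0 or 1.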

3.2 Catalogue of Huertas-Company et al. (2011)

Similar to the Galaxy Zoo project, Huertas-Company et al. 
(2011) performed a morphological classification on the SDSS 
spectroscopic galaxy sample. There are two important dif- 
ferences with respect to Galaxy Zoo: 

(i) The morphological classes are: 

• elliptical, $p_{\rm ell}^{\rm HC}$,

• S0 galaxy, $p_{\rm S0}^{\rm HC}$,

• Sab disc galaxy, $p_{\rm Sab}^{\rm HC}$,

• Scd disc galaxy, $p_{\rm Scd}^{\rm HC}$.

All probabilities taken from the catalogue of Huertas- 
Company et al. (2011) carry a superscript "HC". 

(ii) Instead of visual inspection, a support-vector ma- 
chine, i.e., an automated classification algorithm, has been 
used in order to classify the galaxies. 



The normalisation reads 



$$p_{\rm ell}^{\rm HC} + p_{\rm S0}^{\rm HC} + p_{\rm Sab}^{\rm HC} + p_{\rm Scd}^{\rm HC} = 1 . \tag{10}$$



As mentioned in Huertas-Company et al. (2011), Sect. 3.1 
therein, the "Scd" class not only contains Scd galaxies but 
also irregular galaxies. 



3.3 Additional information from the SDSS database

The Galaxy Zoo catalogue and the catalogue of Huertas- 
Company et al. (2011) have been cross-matched with 
the general SDSS database via the spectral object IDs 
(SpecObjID) of the galaxies. We exploit this matching in
order to obtain additional information about the visually 
classified galaxies. In particular, we retrieved the following 
information from the SDSS database: 

• r-band ellipticity: 

- isophotal axis ratio and orientation angle, 

- Stokes parameters Q and U, including their errors,

• (circular) Petrosian radii in r- and i-band, 

• spectroscopic redshift estimate and its error. 

The two Stokes parameters Q and U encode the complex 
ellipticity (e.g. Bartelmann & Schneider 2001) 



$$e = e_+ + i\,e_\times = Q + iU = \frac{1 - q}{1 + q}\,e^{2i\theta} , \tag{11}$$



where $q = b/a$ denotes the ratio of semi-minor over
semi-major axis and $\theta$ denotes the orientation angle.
From the spectroscopic redshift estimate, z, we estimate the
comoving distance,



$$d(z) = \frac{c}{H_0} \int_0^z \frac{dz'}{\sqrt{(1 + z')^3\,\Omega_m + \Omega_\Lambda}} , \tag{12}$$



assuming a $\Lambda$CDM cosmology with parameters
$H_0 = 100\,h\,{\rm km\,s^{-1}\,Mpc^{-1}}$,
$\Omega_\Lambda = 0.734$ and $\Omega_m = 1 - \Omega_\Lambda$
(Larson et al. 2011). These distance estimates may suffer from
peculiar motions of the galaxies (see discussion in Sect. 7.3
or "Fingers-of-god" effect in, e.g., Hamilton 1998). Using
equatorial coordinates right ascension and declination angles
$(\alpha, \delta)$, we convert to a three-dimensional, global,
spherical coordinate system with polar angles
$\varphi = \alpha$ and $\vartheta = \pi/2 - \delta$.
The position $\vec{r}$ of each galaxy is then simply given by

$$\vec{r} = \begin{pmatrix} d\,\cos\varphi\,\sin\vartheta \\ d\,\sin\varphi\,\sin\vartheta \\ d\,\cos\vartheta \end{pmatrix} . \tag{13}$$

Distances between two galaxies are then computed via
Euclidean distances $|\vec{r}_1 - \vec{r}_2|$, i.e., we assume
that the Euclidean metric does not change with redshift. This
is a reliable approximation, since the galaxies in our sample
span a redshift range where nonlinear cosmological effects are
negligible.
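The chain from redshift to pair separation can be sketched as follows. This is a minimal illustration, assuming that the comoving distance is the standard line-of-sight integral of a flat $\Lambda$CDM model with $\Omega_m = 1 - \Omega_\Lambda$ and using a simple composite Simpson rule; the function names are hypothetical and the numerical scheme is not necessarily the one used for the results of this article.

```python
import math

C_KM_S = 299792.458           # speed of light in km/s
H0 = 100.0                    # km/s/Mpc per unit h, so distances come out in Mpc/h
OMEGA_L = 0.734
OMEGA_M = 1.0 - OMEGA_L       # flat cosmology

def comoving_distance(z, steps=200):
    """Comoving distance in Mpc/h via composite Simpson integration (steps even)."""
    def integrand(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)
    h = z / steps
    total = integrand(0.0) + integrand(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return C_KM_S / H0 * total * h / 3.0

def position(alpha_deg, delta_deg, z):
    """Cartesian position from right ascension, declination and redshift."""
    d = comoving_distance(z)
    phi = math.radians(alpha_deg)
    theta = math.pi / 2.0 - math.radians(delta_deg)
    return (d * math.cos(phi) * math.sin(theta),
            d * math.sin(phi) * math.sin(theta),
            d * math.cos(theta))

def separation(r1, r2):
    """Euclidean distance between two galaxy positions."""
    return math.dist(r1, r2)

# Two hypothetical galaxies at the upper redshift limit of the Scd sample.
r1 = position(180.0, 30.0, 0.020)
r2 = position(180.5, 30.0, 0.020)
print(f"d(0.02) = {comoving_distance(0.02):.2f} Mpc/h, "
      f"separation = {separation(r1, r2):.3f} Mpc/h")
```

At z = 0.02, a redshift error of $10^{-4}$ corresponds to roughly 0.3 Mpc/h along the line of sight, which is not negligible compared to the distance scale of 1 Mpc/h on which alignments are predicted.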

3.4 Data selection 

Not all objects in the catalogue can be used for our analysis.
Some objects have to be removed for various reasons. We
now describe the data selection for the two galaxy samples
used to estimate correlations in handedness and
angular-momentum orientations.

3.4.1 Handedness sample

First, starting from the Galaxy Zoo sample, we select all
galaxies with either $p_Z^{\rm GZ} \geq 0.778$ or
$p_S^{\rm GZ} \geq 0.8$, which results in 36,999 galaxies.
These asymmetric probability thresholds are chosen this way in
order to allow for some flexibility in the correction of the
handedness bias of $b = 0.0105$.

Second, we obtained the r-band Petrosian radii from the 
SDSS Galaxy table, the spectroscopic redshift estimate and 
its error estimate from the SDSS SpecObjAll table. Actu- 
ally, all objects in the Galaxy Zoo sample have been selected 
from the SDSS spectroscopic sample. For reasons unknown 
to us, we could not find 103 objects in the Galaxy table 
and another 5,106 objects were untraceable in the SpecObjAll
table.^3 This leaves us with 31,790 objects with r-band
Petrosian radius and estimates of spectroscopic redshift and 
its error. 

Third, we remove multiple objects from the sample, i.e., 
extended galaxies that have been shredded by the SDSS
pipeline producing multiple entries of a single object. We au- 
tomatically removed galaxy pairs whose angular separations 
were less than 1.5 times the maximum r-band Petrosian ra- 
dius of both galaxies. Furthermore, Slosar et al. (2009) re- 
moved another 69 objects through visual inspection. This 
list has been kindly provided by Anze Slosar such that we 
are capable of removing these objects, too. This leaves us 
with 31,621 galaxies. 
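The automated pair criterion described above can be sketched as follows. We assume that Petrosian radii are given in arcseconds and that sky positions are given in degrees; the dictionary keys and function names are hypothetical, and the brute-force pair search is only meant to make the criterion explicit.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    a1, d1, a2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    hav = (math.sin((d2 - d1) / 2.0) ** 2
           + math.cos(d1) * math.cos(d2) * math.sin((a2 - a1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(hav))) * 3600.0

def is_rogue_pair(gal1, gal2, factor=1.5):
    """True if the pair is closer than factor times the larger Petrosian radius."""
    sep = angular_separation_arcsec(gal1["ra"], gal1["dec"], gal2["ra"], gal2["dec"])
    return sep < factor * max(gal1["petro_r"], gal2["petro_r"])

def find_rogue_pairs(galaxies, factor=1.5):
    """Brute-force O(N^2) search; adequate for a sketch, slow for large samples."""
    pairs = []
    for i in range(len(galaxies)):
        for j in range(i + 1, len(galaxies)):
            if is_rogue_pair(galaxies[i], galaxies[j], factor):
                pairs.append((i, j))
    return pairs

# Three hypothetical catalogue entries; the first two are 5 arcsec apart.
galaxies = [
    {"ra": 150.0, "dec": 2.0, "petro_r": 4.0},
    {"ra": 150.0, "dec": 2.0 + 5.0 / 3600.0, "petro_r": 2.0},
    {"ra": 151.0, "dec": 2.0, "petro_r": 10.0},
]
rogue_pairs = find_rogue_pairs(galaxies)
print(rogue_pairs)
```

The sketch only flags candidate pairs; what happens to the flagged objects (removal of both entries or of one entry chosen at random) is a separate decision described in the text.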

Finally, we apply the additive and symmetric bias correction
of the handedness classifications given by Eq. (9).
Naively interpreting any galaxy with $p_Z \geq 0.8 - b$ as
Z-wise spiral and any galaxy with $p_S \geq 0.8 + b$ as S-wise
spiral, we end up with 15,083 Z-wise and 15,071 S-wise spirals
for a bias correction of $b = 0.0105$. Therefore, our sample is
slightly smaller than the one used by Slosar et al. (2009).

^3 The Galaxy Zoo database provides the SDSS ObjID, which
was used to identify objects in the Galaxy table. Cross-matching
with the SpecObjAll table was done by retrieving the SpecObjID
from the Galaxy table or, if this label was unavailable, by
matching the given ObjID with the BestObjID from the
SpecObjAll table.

3.4.2 Angular-momentum-orientation sample

Based on the catalogue of morphological classifications by
Huertas-Company et al. (2011), we select those galaxies
with spectroscopic redshifts $0 < z \leq 0.02$ and probability
$p_{\rm Scd}^{\rm HC} > 0.5$ to be a galaxy of type Sc or Sd.
This leaves us with 4,236 galaxies satisfying these criteria,
the same number of objects as reported by Lee (2011). For 25
of these objects we could not find any information in the SDSS
database, i.e., estimates of r-band Petrosian radii, Stokes
parameters, their errors, and error estimates of spectroscopic
redshift are missing. For these objects, we set the
spectroscopic redshift error to $10^{-4}$, which is a typical
value for this sample. Petrosian radii are set to zero. Using
the automated method described above, we find 20 rogue pairs
in this sample. For each pair, we randomly discard one of the
two galaxies, such that we are left with a sample of 4,216 Scd
galaxies.

3.5 From axis ratio to angular-momentum orientation

The orientation of the angular-momentum-orientation vector
has to be inferred from the observed galactic disc by
invoking several assumptions. We follow the formalism
described, e.g., in Lee (2011) in order to estimate the
angular-momentum-orientation vector from the observed axis
ratios, elliptical orientation angles, and equatorial
coordinates. In fact, we already used this formalism in
Sect. 2.2. If not specified otherwise, we adopt the same
correction for disc thickness as Lee (2011), who assumed an
intrinsic axial ratio of $p = 0.1$ for Scd galaxies based on
Haynes & Giovanelli (1984). For later purposes, we note that
Heidmann et al. (1972) compared different estimates of the
intrinsic axial ratios and found values between $p = 0.083$
and 0.145 for Scd galaxies.
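As an illustration of the disc-thickness correction, the following sketch converts an observed axis ratio q into an inclination angle. It assumes the standard relation for an oblate disc of intrinsic axial ratio p, $\cos^2 i = (q^2 - p^2)/(1 - p^2)$; we take this to be the correction referred to above, but the full formalism of Lee (2011), which also involves the orientation angle and the sky position, is not reproduced here. The function name is hypothetical.

```python
import math

def inclination_deg(q, p=0.1):
    """Inclination (degrees) from observed axis ratio q for intrinsic axial ratio p.

    Assumes an oblate, intrinsically circular disc: cos^2 i = (q^2 - p^2) / (1 - p^2).
    Observed axis ratios below p are treated as exactly edge-on.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("axis ratio q must lie in (0, 1]")
    cos2 = (q * q - p * p) / (1.0 - p * p)
    cos2 = min(1.0, max(0.0, cos2))   # clamp: q < p means edge-on
    return math.degrees(math.acos(math.sqrt(cos2)))

# Effect of the assumed intrinsic axial ratio on the inferred inclination.
for p in (0.083, 0.1, 0.145):
    print(f"p = {p}: i(q = 0.3) = {inclination_deg(0.3, p):.2f} deg")
```

The loop over p indicates how the range of intrinsic axial ratios quoted from Heidmann et al. (1972) translates into an uncertainty of the inclination, which is largest for nearly edge-on discs.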



4 CORRELATION ESTIMATORS 

In this section, we discuss the correlation estimators for
angular-momentum orientations and handedness. We also
explain how to estimate errors. We start by explaining
the general formalism and then specialise to both
angular-momentum orientations and handedness.

4.1 Simple correlation estimator 

Given two random variates X and Y, we want to estimate
their correlation $\xi_{XY}$ and its error. If N samples
$x_1, x_2, \ldots, x_N$ and $y_1, y_2, \ldots, y_N$ have been
drawn from X and Y and are independent and identically
distributed, a simple correlation estimator^4 is given by,

$$\hat{\xi}_{XY} = \langle (X - \langle X \rangle)(Y - \langle Y \rangle) \rangle = \langle XY \rangle - \langle X \rangle \langle Y \rangle , \tag{14}$$

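^4 More elaborate estimators can be defined. Equation (14) is the
maximum-likelihood estimate, if and only if (X, Y) are drawn

where the hat on $\hat{\xi}_{XY}$ indicates an estimator and

$$\langle XY \rangle = \frac{1}{N} \sum_{n=1}^{N} x_n y_n , \tag{15}$$

$$\langle X \rangle = \frac{1}{N} \sum_{n=1}^{N} x_n \quad \text{and} \quad \langle Y \rangle = \frac{1}{N} \sum_{n=1}^{N} y_n . \tag{16}$$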

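Merely obtaining a value of $\hat{\xi}_{XY}$ via Eq. (14)
alone is not informative in any way. We also need an error
estimate for $\hat{\xi}_{XY}$ in order to get a meaningful
result. As $\hat{\xi}_{XY}$ is the mean of
$(X - \langle X \rangle)(Y - \langle Y \rangle)$, the variance
of $\hat{\xi}_{XY}$ is given by the variance of
$(X - \langle X \rangle)(Y - \langle Y \rangle)$ divided by
N.^5 Consequently, we obtain the error estimate

$$\hat{\sigma}(\hat{\xi}_{XY}) = \frac{\sigma\big((X - \langle X \rangle)(Y - \langle Y \rangle)\big)}{\sqrt{N}} . \tag{17}$$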

Here we assume that N is large enough such that the
likelihood function of the mean
$\langle (X - \langle X \rangle)(Y - \langle Y \rangle) \rangle$
is approximately Gaussian and we are allowed to take the
square-root of the variance in order to obtain a standard
deviation "$\sigma$".
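The simple estimator and its error can be written down directly. The following sketch follows Eq. (14) and the error estimate given as the standard deviation of the products divided by $\sqrt{N}$; the function name is hypothetical, and the use of the 1/N normalisation for the variance (rather than 1/(N-1)) is our own choice for this illustration.

```python
import math

def correlation_estimate(x, y):
    """Return (xi_hat, sigma_hat) for paired samples x and y.

    xi_hat    = < (X - <X>) (Y - <Y>) >
    sigma_hat = std[(X - <X>)(Y - <Y>)] / sqrt(N), with a 1/N variance.
    """
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    products = [(a - mean_x) * (b - mean_y) for a, b in zip(x, y)]
    xi_hat = sum(products) / n
    variance = sum((p - xi_hat) ** 2 for p in products) / n
    return xi_hat, math.sqrt(variance / n)

xi_hat, sigma_hat = correlation_estimate([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(xi_hat, sigma_hat)
```

The error shrinks as $1/\sqrt{N}$ although the scatter of the individual products does not, which is the distinction between the variance of a variate and the variance of its mean.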



4.2 Angular-momentum orientation 

Our aim is to estimate the scalar two-point autocorrelation
function of angular-momentum orientations, $\xi_{LL}(r)$.
Here, we assume spherical symmetry such that
$\xi(\vec{r}) = \xi(r)$. This is a first-order approximation
because the spatial distribution of galaxies in the universe
is not isotropic on short scales ("Cosmic Web"). Usually, the
following estimator is employed (e.g. Pen et al. 2000; Lee
2011),

$$\hat{\xi}_{LL}(r) = \langle p_a p'_a |\hat{L}_a \cdot \hat{L}'_a|^2 \rangle + \langle p_a p'_b |\hat{L}_a \cdot \hat{L}'_b|^2 \rangle + \langle p_b p'_a |\hat{L}_b \cdot \hat{L}'_a|^2 \rangle + \langle p_b p'_b |\hat{L}_b \cdot \hat{L}'_b|^2 \rangle - \frac{1}{3} , \tag{18}$$

where primes indicate the second galaxy in the pair and
subscripts a, b denote the two possible orientations of the
disc's front edge with probabilities $p_a$ and $p_b$. If the
front edge is not estimated, the default values are
$p_a = p_b = \frac{1}{2}$. Introducing the abbreviation
$Z = p_a p'_a |\hat{L}_a \cdot \hat{L}'_a|^2 + p_a p'_b |\hat{L}_a \cdot \hat{L}'_b|^2 + p_b p'_a |\hat{L}_b \cdot \hat{L}'_a|^2 + p_b p'_b |\hat{L}_b \cdot \hat{L}'_b|^2$,
an error estimate of $\hat{\xi}_{LL}(r)$ is

$$\hat{\sigma}(\hat{\xi}_{LL}) = \frac{\sigma(Z)}{\sqrt{N}} , \tag{19}$$

where N denotes the number of galaxy pairs in the relevant
distance bin.
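For a single distance bin, the estimator and its error can be sketched as follows. The data layout (one dictionary per galaxy holding the two candidate unit vectors and their probabilities) and the function names are hypothetical; the sketch assumes that the pairs of the bin have already been selected.

```python
import math

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def pair_statistic(gal, gal_prime):
    """Z for one galaxy pair: probability-weighted squared projections."""
    z = 0.0
    for k in ("a", "b"):
        for k_prime in ("a", "b"):
            weight = gal["p_" + k] * gal_prime["p_" + k_prime]
            z += weight * dot(gal["L_" + k], gal_prime["L_" + k_prime]) ** 2
    return z

def xi_ll(pairs):
    """Return (xi_hat, sigma_hat) for all galaxy pairs of one distance bin."""
    zs = [pair_statistic(g, g_prime) for g, g_prime in pairs]
    n = len(zs)
    mean_z = sum(zs) / n
    var_z = sum((z - mean_z) ** 2 for z in zs) / n
    # For random orientations <|L.L'|^2> = 1/3, hence the offset.
    return mean_z - 1.0 / 3.0, math.sqrt(var_z / n)

# Toy galaxies with unknown front edge (p_a = p_b = 1/2); for simplicity both
# candidate orientations are set equal here.
g = {"L_a": (0.0, 0.0, 1.0), "L_b": (0.0, 0.0, 1.0), "p_a": 0.5, "p_b": 0.5}
h = {"L_a": (1.0, 0.0, 0.0), "L_b": (1.0, 0.0, 0.0), "p_a": 0.5, "p_b": 0.5}
xi_hat, sigma_hat = xi_ll([(g, g), (g, h)])
print(xi_hat, sigma_hat)
```

Because the projections enter squared, the estimator is insensitive to the overall sign of the angular-momentum vectors; the two front-edge configurations matter because they correspond to different directions, not merely opposite ones.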



from a bivariate Gaussian, i.e., if there are no higher-order corre- 
lations. 

^5 Note the following important difference: If we are interested
in estimating some random variate Z, we employ its mean
$\langle Z \rangle$ and its variance
$\langle Z^2 \rangle - \langle Z \rangle^2$. However, in this
case we are not interested in estimating Z but in estimating
the mean of Z, and the variance of $\langle Z \rangle$ equals
the variance of Z divided by the number of samples drawn from
Z. Loosely speaking, if we draw more samples from Z, the
distribution of Z does not change, in particular its width
(variance) stays constant. However, drawing more samples from
Z enables us to estimate the mean of the distribution more
accurately.



4.3 Handedness 

We also want to estimate the two-point autocorrelation
function of handedness $\xi_{HH}(r)$. Again assuming spherical
symmetry, a general estimator is given by,

$$\hat{\xi}_{HH}(r) = \langle h h' \rangle , \tag{20}$$

where we have defined the handedness

$$h = p_Z - p_S . \tag{21}$$

As explained in Sect. 3.1, the mean handedness is zero in the
whole sample, i.e., $\langle h \rangle = \langle h' \rangle = 0$.
Handedness alignments cannot change this in individual distance
bins if the number of galaxy pairs is large enough. In every
distance bin, let $n_+$ denote the number of galaxy pairs with
$h h' = +1$ and $n_-$ the number of galaxy pairs with
$h h' = -1$. We can then rewrite Eq. (20) to read

$$\hat{\xi}_{HH}(r) = \frac{n_+ - n_-}{n_+ + n_-} = f_+ - f_- = 2 f_+ - 1 , \tag{22}$$

where $f_\pm = n_\pm / (n_+ + n_-)$ denotes the fraction of
galaxy pairs with positive or negative handedness products,
respectively. An error estimate of $\hat{\xi}_{HH}(r)$ is
obtained from the fact that counting positive handedness
products is a Bernoulli trial, i.e., $n_\pm$ are subject to the
binomial distribution while $f_\pm$ are subject to the beta
distribution (e.g. Cameron 2011).
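For hard handedness values, Eq. (22) and a beta-distribution error can be sketched as follows. We assume a uniform prior on $f_+$, such that its posterior is a Beta distribution with parameters $n_+ + 1$ and $n_- + 1$, and we summarise it by its standard deviation; this is one possible way of turning the beta distribution into an error bar and not necessarily the procedure applied later in this article. The function name and the pair counts are hypothetical.

```python
import math

def handedness_correlation(n_plus, n_minus):
    """Return (xi_hat, sigma_hat) from counts of pairs with hh' = +1 and hh' = -1."""
    n = n_plus + n_minus
    if n == 0:
        raise ValueError("distance bin contains no galaxy pairs")
    xi_hat = (n_plus - n_minus) / n            # equals 2 f_+ - 1
    # Posterior of f_+ for a uniform prior: Beta(a, b).
    a, b = n_plus + 1, n_minus + 1
    var_f = a * b / ((a + b) ** 2 * (a + b + 1))
    # xi = 2 f_+ - 1, hence sigma(xi) = 2 sigma(f_+).
    return xi_hat, 2.0 * math.sqrt(var_f)

xi_hat, sigma_hat = handedness_correlation(60, 40)
print(f"xi_HH = {xi_hat:.3f} +/- {sigma_hat:.3f}")
```

In contrast to a Gaussian approximation, the beta distribution remains well defined for bins with few pairs or with $f_+$ close to 0 or 1.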



5 THE IMPACT OF ERRORS 

This section is dedicated to a detailed investigation of the
impact of various error sources on autocorrelation estimates
of handedness and angular-momentum-orientation vectors,
respectively. As key results, we finally provide marginal
estimates of these autocorrelation functions which take into
account all relevant error sources. Like Lee (2011), we employ
isophotal ellipticity estimates as far as
angular-momentum-orientation vectors are concerned. However,
our methodological discussion of error propagation is also
relevant in a wider context, e.g., concerning correlation
functions for investigations of baryonic acoustic oscillations.

5.1 Conditional vs. marginal errors 

Previous estimates (e.g. Slosar et al. 2009; Lee 2011) employ 
certain input parameters such as redshift estimates using 
only their maximum-likelihood values, without propagating 
the errors of these values. Hence, these estimates are
conditional instead of marginal estimates.^6 Consequently, we
now need to explain the conceptual difference between con- 
ditional and marginal errors. 

^6 Slosar et al. (2009) derived pseudo-marginal estimates of the
handedness autocorrelation function. Although they marginalised
their likelihoods, they used conditional input data.

For the sake of simplicity, let us consider fitting data
D with Gaussian noise using a model with two linear
parameters $\theta_1$ and $\theta_2$. In this case, the
likelihood function $\mathcal{L}(D|\theta_1, \theta_2)$ is a
bivariate Gaussian also in the linear parameters and its
covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \rho_{12}\,\sigma_1 \sigma_2 \\ \rho_{12}\,\sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix} \tag{23}$$

can be found by a Fisher analysis (e.g. Heavens 2009).
Here, al and o"! denote the variances of 61 and 62, whereas 
— 1 ^ P12 1 is the correlation coefficient. tJi is the stan- 
dard deviation of this Gaussian if sliced at the mean value of 
62- Therefore, ai is the conditional error otOi, "conditional" 
because it depends on where the Gaussian has been sliced, 
i.e., the mean value of 62- Conversely, the marginal error of 
9i is independent of the value of 62- This marginal error is 
obtained by projecting the bivariate Gaussian onto the 61- 
axis, instead of slicing it. Marginal errors are never smaller 
than conditional errors. Consequently, the conditional error 
(Ji underestimates the true error on 61, such that, e.g., sta- 
tistical significance is overestimated. 
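The difference between slicing and projecting a bivariate Gaussian can be checked numerically. The following is a minimal sketch that is not taken from the paper: it assumes a hypothetical 2x2 Fisher matrix F (the inverse of the parameter covariance), for which the conditional error of the first parameter is 1/sqrt(F_11) and the marginal error is sqrt((F^-1)_11). The function name and the numerical values are illustrative assumptions.

```python
import math

def conditional_and_marginal_error(f11, f12, f22):
    """Errors of parameter 1 for a bivariate Gaussian likelihood.

    f11, f12, f22 are the entries of a symmetric 2x2 Fisher matrix
    (hypothetical illustrative input, not values from the paper).
    """
    det = f11 * f22 - f12 ** 2
    if det <= 0.0 or f11 <= 0.0:
        raise ValueError("Fisher matrix must be positive definite")
    conditional = 1.0 / math.sqrt(f11)   # slice at a fixed value of parameter 2
    marginal = math.sqrt(f22 / det)      # (F^-1)_11: projection onto axis 1
    return conditional, marginal

cond, marg = conditional_and_marginal_error(4.0, 3.0, 4.0)
print(cond, marg)
```

The two errors coincide only if the off-diagonal entry vanishes; any correlation between the parameters makes the marginal error larger.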



5.2 Uncertainties in classifications 

The morphological classifications of Galaxy Zoo and
Huertas-Company et al. (2011) are probabilistic, i.e., every
object is assigned a probability to belong to each of the
possible morphological types. This is in contrast to
non-probabilistic - "hard" - assignments, where every object
is clearly assigned to a certain type. Hard assignments are
easier to carry out and interpret, which is why many
astronomers have a natural affinity to this approach.
Unfortunately, galaxy morphologies cannot be clearly assigned
to morphological types in general - apart from singular
prototypical examples of very obvious morphology. The bulk of
galaxies has uncertain morphologies, i.e., the morphological
types are overlapping such that hard classification schemes
are biased (Andrae et al. 2010). For instance, a galaxy with
$p_Z = 0.8$ still has a 20% chance not to be a Z-spiral - or a
disc galaxy at all.^7 Discarding the classification uncertainty
by introducing a hard cut pretends that the data is more
accurate than it actually is. This inevitably leads us to
underestimate the errors, thereby compromising estimates of
statistical significance.

In fact, Slosar et al. (2009) turned the probabilistic
assignments of Galaxy Zoo into hard assignments by introducing
a hard cut: For the clean sample, every galaxy with
$p_Z \geq 0.8$ is considered as Z-wise spiral and every galaxy
with $p_S \geq 0.8$ is considered as S-wise spiral, while all
other galaxies are discarded. Similarly, Lee (2011) considers
every galaxy with $p_{\rm Scd} > 0.5$ as an Scd galaxy. Nothing
enforces such a hard cut. We explain in Sects. 5.7 and 5.9 how
to account for these classification uncertainties in estimating
the correlation functions of handedness and angular-momentum
orientations.



^7 The Galaxy Zoo probabilities may exhibit minor biases due to
people voting incorrectly out of confusion or malice. However,
Lintott et al. (2008) weighted the users depending on how their
votes agreed with the majority. Moreover, on average, every
galaxy has received 39 votes (Land et al. 2008) such that
deliberate misclassification should give rise to a minor bias
only. Certainly, that effect is much smaller than the bias we
would incur if we cut the classification probabilities. In
fact, it is very hard to do worse than a discontinuous hard cut.
Any reasonable continuous transition between two classes is
virtually guaranteed to be a better approximation to reality
than a hard cut, which corresponds to a discontinuous step in
such a two-class transition.

Figure 2. Likelihood function of comoving distance for a galaxy
with spectroscopic redshift of $z = (6.5993 \pm 0.0078) \cdot 10^{-2}$.
The likelihood has been estimated by drawing 10,000 Monte-Carlo
samples from the error distribution of the spectroscopic redshift
and is approximately Gaussian with mean $(183.16 \pm 0.20)$ Mpc/h.

5.3 Errors in spectroscopic redshift estimates 

Both autocorrelation functions introduced in Sects. 4.2
and 4.3 require estimates of distances of galaxy pairs and
these distances are uncertain due to errors in the redshift
estimates. In order to assess the impact of redshift errors,
we randomly select a single galaxy from our SDSS subsample
and draw 10,000 Monte-Carlo samples from its redshift-error
distribution. For every sampled value of redshift, we
compute the comoving distance and monitor its distribution.
As is evident from Fig. 2, the errors in the comoving
distances are of the same order of magnitude as the typical
distance scale of the correlations reported in the literature
(~1 Mpc/h). Consequently, these errors are important and
have to be taken into account. We explain in Sect. 5.5 how
to propagate these redshift errors by Monte-Carlo sampling.
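The Monte-Carlo experiment described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: it assumes a Gaussian redshift error and a flat LambdaCDM cosmology with Omega_m = 0.3, which is not necessarily the cosmology adopted in the paper, so the resulting numbers need not reproduce the values quoted in Fig. 2. All function names are hypothetical.

```python
import math
import random
import statistics

C_OVER_H0 = 2997.92458  # Hubble distance c / (100 km/s/Mpc) in Mpc/h

def comoving_distance(z, omega_m=0.3, steps=64):
    """Line-of-sight comoving distance in Mpc/h for flat LambdaCDM.

    omega_m = 0.3 is an assumed value. The integral of 1/E(z) is
    evaluated with Simpson's rule (steps must be even).
    """
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + 1.0 - omega_m)
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return C_OVER_H0 * total * h / 3.0

def sample_distances(z_mean, z_sigma, n_samples=10_000, seed=1):
    """Draw redshifts from a Gaussian error model and map them to distances."""
    rng = random.Random(seed)
    return [comoving_distance(rng.gauss(z_mean, z_sigma))
            for _ in range(n_samples)]

distances = sample_distances(0.065993, 0.000078)
print(statistics.mean(distances), statistics.stdev(distances))
```

The scatter of the sampled distances is the quantity that has to be compared with the distance scale of the expected correlations.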

5.4 Errors in ellipticity estimates 

Errors in ellipticity estimates used as proxies for disc in- 
clination clearly have an impact on the estimation of the 
angular-momentum orientations and their correlation func- 
tion. We now try to estimate these errors. We explain in 
Sect. 5.5 how to propagate ellipticity errors by Monte-Carlo 
sampling. 

First, considering the isophotal ellipticities used by Lee
(2011), the SDSS database unfortunately does not offer error
estimates.^8 Consequently, employing isophotal ellipticities,
the SDSS database strictly does not enable us to estimate
a marginal autocorrelation function. In order to get a rough
estimate of the errors in isophotal ellipticities, we make use
of the rogue pairs in the SDSS database, i.e., multiple entries
of identical galaxies. Starting out from 698,420 galaxies in
the classification table provided by Huertas-Company et al.
(2011), we identify rogue pairs as galaxy pairs whose angular
separation is less than 0.4 arcsec, which roughly corresponds
to one pixel size.^9 We find 1,596 such pairs. We then monitor
the difference in axis ratios and orientation angles of every
pair. The resulting distributions are shown in Fig. 3. As
rough error estimate for the isophotal axis ratio, we obtain
a standard deviation of

$$\sigma(q_{\rm iso}) \approx 0.0562 , \qquad (24)$$

when fixing the mean to zero. The distribution of differences
in orientation angles is not Gaussian but has more prominent
wings. We therefore model the likelihood function of
orientation angles with mean angle $\theta_0$ as a mixture of two
Gaussians of different width,

$$\mathcal{L}(\theta|\theta_0,\sigma_1,\sigma_2,a) = a\,\mathcal{N}(\theta|\theta_0,\sigma_1) + (1-a)\,\mathcal{N}(\theta|\theta_0,\sigma_2) . \qquad (25)$$

The bottom panel of Fig. 3 displays the distribution of
differences of two values drawn from Eq. (25), whose likelihood
is obtained by convolving $\mathcal{L}(\theta|\theta_0,\sigma_1,\sigma_2,a)$
with itself. The resulting likelihood function then reads

$$\mathcal{L}(\Delta\theta|\sigma_1,\sigma_2,a) = a^2\,\mathcal{N}(\Delta\theta|0,\sqrt{2}\sigma_1) + 2a(1-a)\,\mathcal{N}\left(\Delta\theta|0,\sqrt{\sigma_1^2+\sigma_2^2}\right) + (1-a)^2\,\mathcal{N}(\Delta\theta|0,\sqrt{2}\sigma_2) . \qquad (26)$$

Manually adjusting the model parameters of Eq. (25), we
obtain a rough error estimate for the isophotal orientation
angle with parameters $a \approx 0.73$, $\sigma_1 \approx 2.7°$ and
$\sigma_2 \approx 15.0°$. If we required an angular separation of 0,
i.e., identical coordinates, we would still end up with 17
pairs exhibiting similar scatter in both parameters.

^8 In fact, the table GALAXY contains columns for the errors of
the isophotal ellipticities. However, for the relevant objects these
columns are only filled with invalid default values.

Figure 3. Distributions of differences in isophotal axis ratios
(top) and isophotal orientation angles (bottom) for the 1,596
rogue pairs in the catalogue of Huertas-Company et al. (2011).
The top panel is well approximated by a Gaussian with mean zero
and standard deviation of 0.0795 (dashed line), which yields
an error estimate of $\sigma(q_{\rm iso}) = 0.0795/\sqrt{2} \approx 0.0562$.
The distribution of differences in orientation angles (bottom
panel) is not Gaussian, but described by the ad-hoc model of
Eq. (26) (dashed line) based on Eq. (25) with manually adjusted
parameters $a \approx 0.73$, $\sigma_1 \approx 2.7°$ and
$\sigma_2 \approx 15.0°$.
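The relation between the single-measurement error model of Eq. (25) and the distribution of pair differences in Eq. (26) can be verified by simulation. The sketch below is illustrative only: it draws pairs of angle errors from the two-Gaussian mixture with the parameters quoted above and compares the scatter of their differences with the value implied by Eq. (26), sqrt(2 [a sigma_1^2 + (1-a) sigma_2^2]). The function names, the sample size and the seed are assumptions.

```python
import math
import random
import statistics

def draw_angle_error(rng, a=0.73, sigma1=2.7, sigma2=15.0):
    """One draw from the zero-mean two-Gaussian mixture of Eq. (25), in degrees."""
    sigma = sigma1 if rng.random() < a else sigma2
    return rng.gauss(0.0, sigma)

def difference_std(a=0.73, sigma1=2.7, sigma2=15.0):
    """Analytic standard deviation of the difference of two independent draws."""
    return math.sqrt(2.0 * (a * sigma1 ** 2 + (1.0 - a) * sigma2 ** 2))

rng = random.Random(42)
diffs = [draw_angle_error(rng) - draw_angle_error(rng) for _ in range(200_000)]
print(statistics.pstdev(diffs), difference_std())

# Single-measurement error of the axis ratio from the pair-difference scatter.
print(0.0795 / math.sqrt(2.0))
```

The same factor of sqrt(2) between pair differences and single measurements underlies the axis-ratio error of Eq. (24).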

Second, the correction for intrinsic axial ratios of Scd 
galaxies is subject to uncertainties, too. Wherever we ne- 
glect ellipticity errors, we also neglect errors in intrinsic ax- 
ial ratios and simply adopt p = 0.1. Conversely, if we take 
into account ellipticity errors, we will automatically also take 
into account errors in the intrinsic axial ratio. In this case, 
we assume that p is drawn from a uniform distribution over 
the interval [0.083,0.145] (see Sect. 3.5). 

5.5 Propagating errors numerically 

We now explain how to incorporate errors in redshift
estimates and ellipticity estimates. The crucial problem is
that neither error can be propagated analytically.

We propagate the measurement errors of spectroscopic 
redshift and ellipticity by drawing 1,000 Monte-Carlo reali- 
sations from the error distributions of both parameters and 
averaging the results over all Monte-Carlo realisations. A 
value for the intrinsic axial ratio is drawn from the uniform 
interval [0.083, 0.145] once for every Monte-Carlo realisation, 
i.e., in each realisation all galaxies have the same correction 
for intrinsic axial ratio. This Monte-Carlo sampling is in 
fact a marginalisation over the errors of both observables, 
spectroscopic redshift and ellipticity. Typically, both error 
sources are not taken into account (e.g. Slosar et al. 2009; 
Lee 2011), which yields correlation estimates with condi- 
tional errors - conditional because they assume, e.g., the 
observed redshifts were the true ones. 

A final remark concerning the correlation estimate: We
monitor the distribution of the correlation values $\xi$
resulting from the 1,000 Monte-Carlo realisations. However, a
fundamental difference to Eq. (17) is that now $\xi$ itself is a
random variate. Consequently, we are now interested in the
variance of $\xi$ but not in the variance of the mean of $\xi$.
The difference is a factor of 1,000 in the variances. It is
obvious that this approach is correct, since otherwise we could
make the resulting errors arbitrarily small by increasing the
number of Monte-Carlo realisations.
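The distinction between the variance of the correlation value and the variance of its mean can be made concrete with a toy experiment. In the sketch below, `toy_correlation` is a hypothetical stand-in for one Monte-Carlo realisation of the estimator (a fixed value plus Gaussian scatter); it is not the estimator of the paper.

```python
import random
import statistics

def toy_correlation(rng):
    """Stand-in for one Monte-Carlo realisation of the correlation estimate.

    Hypothetical toy model: a fixed value plus Gaussian scatter that mimics
    the effect of resampling redshifts and ellipticities.
    """
    return 0.02 + rng.gauss(0.0, 0.05)

rng = random.Random(7)
n_real = 1000
xi = [toy_correlation(rng) for _ in range(n_real)]

var_xi = statistics.variance(xi)   # scatter of xi itself: the error of interest
var_of_mean = var_xi / n_real      # variance of the mean: shrinks with n_real
print(var_xi ** 0.5, var_of_mean ** 0.5)
```

Increasing `n_real` reduces only the second number, which is why it cannot serve as the error of the correlation estimate.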



5.6 Negligible error sources 

There are further sources of errors which could be taken into 
account but are not relevant in our case. 

For instance, uncertainties in the cosmological parameters
have an impact on the comoving distances. In our case,
this is irrelevant because all galaxies are affected the same
way. However, if in a different context the task is to use
marginal autocorrelation functions in order to do cosmological
inference, it may be mandatory to also incorporate
uncertainties of cosmological parameters into the Monte-Carlo
sampling described in Sect. 5.5. We experienced that this
increases the error in comoving distances by approximately
a factor of two.

^9 Here, we assume that for multiple entries the whole galaxy is
used for parameter estimation and not only a shredded part of
the galaxy.

^10 Analysing multiple Monte-Carlo realisations is of course
computationally expensive. However, this task is still easily
executed on a standard computer.

Another negligible error source is the position estimate
of a galaxy in equatorial coordinates, whose error is obviously
much smaller than, e.g., any redshift error. Given the pixel
size of 0.4 arcsec of SDSS, at a redshift of $z = 0.066$
and comoving distance of $d = 183$ Mpc/h a misestimation by one
pixel corresponds to a transverse error of 0.35 kpc/h.
This is several orders of magnitude below the theoretically
expected correlation length of roughly 1 Mpc/h (Schäfer &
Merkel 2011).
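The quoted transverse error follows from the small-angle approximation, as the following short calculation illustrates. The function name is hypothetical; the inputs are the pixel size and the distance given in the text.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)

def transverse_offset_kpc(angle_arcsec, distance_mpc):
    """Small-angle transverse offset in kpc/h for a distance given in Mpc/h."""
    return angle_arcsec * ARCSEC_IN_RAD * distance_mpc * 1000.0

offset = transverse_offset_kpc(0.4, 183.0)
print(round(offset, 3))
```

One pixel thus corresponds to a few tenths of a kpc/h, which is negligible compared to a correlation length of order 1 Mpc/h.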

5.7 Impact on autocorrelation of handedness 

In this section, we discern the impact of various error sources 
on estimates of the handedness autocorrelation function, 
namely classification uncertainties, redshift errors and num- 
ber statistics. Our ultimate goal is a marginal estimate of the 
handedness autocorrelation function, where all errors have 
been marginalised out. 

First, we use the hard estimator from Slosar et al.
(2009), which does not account for uncertainties in
classification and redshift. Panel (a) of Fig. 4 shows our
estimate of this autocorrelation function for the Galaxy Zoo
sample. Qualitatively, our results agree with the results of
Slosar et al. (2009). We observe positive correlations, i.e., an
alignment of handedness, on short distances, too. Although there
are minor differences which might arise from the slightly
different data sets used, the general agreement validates our
method.

Second, we take into account uncertainties in the handedness
classifications, but still ignore redshift errors. In every
distance bin, we compute the handedness products

$$h h' = (p_Z - p_S)(p'_Z - p'_S) = p_Z p'_Z + p_S p'_S - p_Z p'_S - p_S p'_Z , \qquad (27)$$

which can now take any value in the interval $[-1, 1]$. The
correlation estimator of Eq. (22) is unchanged. However, $n_\pm$
now are not the number of pairs where $hh' = \pm 1$, but are
rather defined by

$$n_+ = \sum_{\rm pairs} (p_Z p'_Z + p_S p'_S) \quad {\rm and} \quad n_- = \sum_{\rm pairs} (p_Z p'_S + p_S p'_Z) . \qquad (28)$$

Note that if $N$ denotes the number of galaxy pairs in a given
distance bin, then $n_+ + n_- \leq N$. Consequently, this reduces
the "effective" number of galaxy pairs in a given distance
bin because the contribution of every galaxy pair is
down-weighted by the probability that either galaxy is not a
spiral with handedness. Furthermore, reducing the effective
number of galaxy pairs also increases the error of the
correlation estimate through the beta distribution (e.g.
Cameron 2011). Results of this estimator are shown in panel (b)
of Fig. 4. As expected, the errorbars are indeed slightly larger.
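The soft pair counts of Eq. (28) are straightforward to compute. The following sketch uses made-up classification probabilities for three galaxy pairs; the function name and the data layout are assumptions chosen for illustration.

```python
def soft_pair_counts(pairs):
    """Return (n_plus, n_minus, n_pairs) following Eqs. (27) and (28).

    `pairs` is a list of ((pz, ps), (pz2, ps2)) tuples holding the Z-wise and
    S-wise probabilities of the two galaxies of each pair.
    """
    n_plus = sum(pz * pz2 + ps * ps2 for (pz, ps), (pz2, ps2) in pairs)
    n_minus = sum(pz * ps2 + ps * pz2 for (pz, ps), (pz2, ps2) in pairs)
    return n_plus, n_minus, len(pairs)

# Made-up probabilities for three galaxy pairs in one distance bin.
pairs = [((0.8, 0.1), (0.6, 0.3)),
         ((0.9, 0.05), (0.1, 0.85)),
         ((0.5, 0.4), (0.45, 0.45))]
n_plus, n_minus, n_pairs = soft_pair_counts(pairs)
print(n_plus, n_minus, n_pairs)
```

The difference n_plus - n_minus equals the sum of the handedness products of Eq. (27), while the sum n_plus + n_minus stays below the number of pairs whenever the classifications are uncertain.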

Third, we account for redshift errors but ignore
classification uncertainties. As described in Sect. 5.5, we draw
1,000 Monte-Carlo realisations from the error distributions
of the spectroscopic redshift estimates and average over all
realisations. Panel (c) of Fig. 4 shows the resulting estimate
of the handedness autocorrelation. In comparison to panels
(a) and (b), the autocorrelation function now looks remarkably
smooth. Errors in redshift cause uncertainties in the
distances, i.e., galaxy pairs end up in different distance bins
in different realisations. Consequently, a likely explanation
for all the substructures in panels (a) and (b) is that they
are noise features that have been enhanced by binning.

Finally, panel (d) shows the marginal autocorrelation
function, which takes into account all important sources
of uncertainty. The errorbars are so large that apparently
no statistically significant positive correlation of handedness
can be detected. However, we have to refine this question in
the next section, as we should not attempt to assess
statistical significance from binned data.

5.8 Parameter estimation 

Figure 4 shows binned versions of the estimated correlation
function. This is acceptable as long as we only study the
dependence of the errorbars on the different error sources.
However, in order to assess the statistical significance of
positive autocorrelations in the final marginal estimate, we
should try to avoid the ambiguities introduced by binning.
For this purpose, we employ the likelihood function of the
data $D$ introduced by Slosar et al. (2009),

$$\mathcal{L}[D|\xi(r)] = \prod_{{\rm pairs}\ p} \left(1 + d_p\,\xi(r_p)\right) , \qquad (29)$$

where $r_p$ is the distance between the two galaxies of the $p$-th
pair. The coefficient $d_p$ is the handedness product of both
galaxies. As Slosar et al. (2009) used hard cuts of the Galaxy
Zoo classifications, $d_p = \pm 1$ in their case. We modify this
by equating $d_p$ with Eq. (27) such that now $-1 \leq d_p \leq +1$
and galaxy pairs are weighted by the probability that both
of them are spirals.

In order to assess the statistical significance of potential
positive autocorrelations in spiral-arm handedness, we follow
Slosar et al. (2009) in using the Bayes factor,

$$\frac{{\rm prob}(D|\mathcal{M}_+)}{{\rm prob}(D|\mathcal{M}_0)} = \frac{\int {\rm prob}(D|\theta_+,\mathcal{M}_+)\,{\rm prob}(\theta_+|\mathcal{M}_+)\,d\theta_+}{\int {\rm prob}(D|\theta_0,\mathcal{M}_0)\,{\rm prob}(\theta_0|\mathcal{M}_0)\,d\theta_0} . \qquad (30)$$

Here, ${\rm prob}(D|\mathcal{M}_n)$ denotes the likelihood of the data $D$
given the model $\mathcal{M}_n$, irrespective of what the parameter
values $\theta_n$ of model $\mathcal{M}_n$ are. Conversely,
${\rm prob}(D|\theta_n,\mathcal{M}_n)$ denotes the likelihood of the data
given the model $\mathcal{M}_n$ and certain parameter values $\theta_n$,
while ${\rm prob}(\theta_n|\mathcal{M}_n)$ denotes the prior probability of
the parameter values $\theta_n$ of model $\mathcal{M}_n$.^11
In our case, the model $\mathcal{M}_0$ describes the null hypothesis
that no autocorrelation exists, i.e., $\xi(r) = 0$. This model
has no free parameters, such that we can directly evaluate
${\rm prob}(D|\mathcal{M}_0)$ via Eq. (29). Conversely, the model
$\mathcal{M}_+$ is supposed to describe positive autocorrelations.
Here, we have to make a choice how we parametrise such positive
autocorrelations. Like Slosar et al. (2009), we then employ two
parametrisations, an exponential and a Gaussian,

$$\xi_{\rm exp}(r) = a\,e^{-r/b} \quad {\rm and} \quad \xi_{\rm Gauss}(r) = a\,e^{-[\ldots]} , \qquad (31)$$

with model amplitudes $a$ and model correlation lengths $b$.
^11 If we assume that both models, $\mathcal{M}_+$ and $\mathcal{M}_0$,
are equally likely a priori, i.e., if we have no a-priori
preference, then the Bayes factor is identical to the ratio of
model posteriors, which quantify the probability of the model
given the data.

Figure 4. Impact of various errors on estimates of the handedness autocorrelation function. Panel (a): Hard estimate neglecting
classification uncertainties and redshift errors, taking into account only number statistics. Panel (b): Soft estimate accounting for
classification uncertainties and number statistics, neglecting redshift errors. Panel (c): Estimate accounting for redshift errors and
number statistics, ignoring classification uncertainties. Panel (d): Marginal estimate taking into account classification uncertainties,
redshift errors and number statistics. Furthermore, we show autocorrelation estimates parametrised as exponential and Gaussian
according to Fig. 5.

Figure 5. Likelihood contours constraining the $a$-$b$ plane of
exponential (left) and Gaussian (right) handedness-autocorrelation
functions. The likelihood maximum is indicated by a cross and
the contours enclose 68.3%, 95.5%, and 99.7% confidence. The
maximum of the exponential model occurs at $a = 0.38$ and
$b = 0.48$ Mpc/h. The maximum of the Gaussian model occurs
at $a = 0.18$ and $b = 0.66$ Mpc/h.

For both models, we use flat and normalised priors within
the intervals $a \in (0, 1]$ and $b \in (0, 3]$. In contrast to Slosar
et al. (2009), we also exclude $a = 0$ in order to ensure
that $\mathcal{M}_+$ and $\mathcal{M}_0$ are indeed mutually exclusive. As
both parametrisations introduced in Eq. (31) have two free
parameters, $a$ and $b$, we cannot evaluate $\mathcal{M}_+$ directly.
Rather, we compute the likelihood manifolds of $a$ and $b$ for both
models using a brute-force grid. Figure 5 shows the resulting
likelihood manifolds averaged over the 1,000 noise realisations
drawn from the redshift-error distribution. Our results look
very similar to those shown in Fig. 4 of Slosar et al. (2009)
and our most likely values agree nicely with their values. The
best-fit estimates for both models are also shown in panel (d)
of Fig. 4. Given the brute-force likelihood grid
$\mathcal{L}_{ij} = \mathcal{L}(a_i,b_j)$, the marginalisation integral in
Eq. (30) can be approximated by a Riemann sum,

$$\int_0^1 da \int_0^3 db\ {\rm prob}(D|a,b,\mathcal{M}_+)\,{\rm prob}(a,b|\mathcal{M}_+) \approx \sum_{i,j} \mathcal{L}(a_i,b_j)\,\Delta a_i\,\Delta b_j\,\frac{1}{3} , \qquad (32)$$

where ${\rm prob}(a,b|\mathcal{M}_+) = \frac{1}{3}$ is the normalised flat
prior on the interval $a \in (0,1]$ and $b \in (0,3]$, while
$\Delta a_i = \Delta a$ and $\Delta b_j = \Delta b$ denote the equidistant
stepsizes of the brute-force likelihood grids shown in Fig. 5.
This results in Bayes factors of 27.9 for the exponential
model^12 and 13.1 for the Gaussian model, respectively. These
values can be interpreted as strong but not yet decisive
evidence in favour of positive autocorrelations. Decisive
evidence requires Bayes factors larger than 100 (e.g. Kass &
Raftery 1993).
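The Riemann-sum evaluation of the Bayes factor can be sketched as follows. This is an illustration rather than the authors' implementation: it assumes that the pair likelihood is the product of (1 + d_p xi(r_p)) over all pairs, uses the exponential parametrisation only, and works on made-up pair data. Grid sizes and function names are assumptions; for realistic sample sizes one would accumulate log-likelihoods instead of a plain product.

```python
import math

def xi_exp(r, a, b):
    """Exponential parametrisation of the autocorrelation function."""
    return a * math.exp(-r / b)

def likelihood(pairs, a, b, model=xi_exp):
    """Assumed pair likelihood: product of (1 + d_p * xi(r_p)) over all pairs."""
    value = 1.0
    for r, d in pairs:
        value *= 1.0 + d * model(r, a, b)
    return value

def bayes_factor(pairs, model=xi_exp, n_a=50, n_b=60, a_max=1.0, b_max=3.0):
    """Riemann sum over a flat prior on (0, a_max] x (0, b_max].

    The grid starts at one stepsize, so a = 0 and b = 0 are excluded.
    """
    da, db = a_max / n_a, b_max / n_b
    prior = 1.0 / (a_max * b_max)
    evidence = 0.0
    for i in range(1, n_a + 1):
        for j in range(1, n_b + 1):
            evidence += likelihood(pairs, i * da, j * db, model) * da * db * prior
    # Null model: xi(r) = 0 everywhere, i.e. amplitude a = 0.
    return evidence / likelihood(pairs, 0.0, 1.0, model)

# Made-up (distance in Mpc/h, handedness product) pairs.
toy_pairs = [(0.2, 0.5), (0.4, 0.3), (1.5, -0.1), (2.5, 0.05)]
print(bayes_factor(toy_pairs))
```

If all handedness products vanish, the data carry no information and the sketch returns a Bayes factor of one, which is a useful sanity check of the prior normalisation.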



5.9 Impact on autocorrelation of angular-momentum orientation

Now, we discern the impact of various error sources on
estimates of the autocorrelation function of angular-momentum
orientation vectors. Again, our ultimate goal is a marginal
estimate of this autocorrelation function.

First, we try to reproduce the estimate of angular- 
momentum-orientation autocorrelation of Lee (2011). The 
only difference is that we have removed 20 objects from the 
galaxy sample in order to eliminate rogue pairs. Panel (a) 
of Fig. 6 shows our resulting estimate of the autocorrelation 
via Eq. (18). Our result is identical to the one of Lee (2011). 
This implies that, first, our method is working correctly, and, 
second, that a few rogue pairs have negligible impact on the 
results of Lee (2011). 

^12 This means that it is 27.9 times more likely that the data has
been drawn from an exponential whose amplitude is somewhere in
the range (0, 1] and scale radius is somewhere in (0, 3] Mpc/h than
that the data has been drawn from a zero correlation function.

Second, we study the impact of uncertainties of
morphological classification. Formally, the estimator defined in
Eq. (18) does not change, only the effective number of galaxy
pairs in all redshift bins is reduced. Picking out a single of
the four terms in Eq. (18), we change the definition

$$\left\langle |L_a \cdot L'_a|^2 \right\rangle = \frac{\sum_{\rm pairs} p^{\rm HC}_{\rm Scd}\,p'^{\rm HC}_{\rm Scd}\,|L_a \cdot L'_a|^2}{\sum_{\rm pairs} p^{\rm HC}_{\rm Scd}\,p'^{\rm HC}_{\rm Scd}} . \qquad (33)$$

This weights the contribution of every pair by the probability
$p_{\rm Scd}\,p'_{\rm Scd}$ that both galaxies are Scd galaxies.
Furthermore, the number $N$ of pairs in the distance bin is
replaced by the sum of weights $\sum_{\rm pairs} p_{\rm Scd}\,p'_{\rm Scd}$.
Obviously, this weighting also affects the error estimate of
Eq. (19). Panel (b) of Fig. 6 shows the probabilistic correlation
estimate. Evidently, the hard estimator used by Lee (2011)
substantially underestimates the errors, thereby overestimating
the actual statistical significance. As class probabilities were
cut at $p_{\rm Scd} > 0.5$, classification uncertainties have a
larger impact than in the case of handedness where the cut of
handedness probabilities was at 0.8.
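The probability weighting of Eq. (33) amounts to a weighted average whose sum of weights plays the role of an effective pair number. A minimal sketch with made-up numbers follows; the function name and the example values are assumptions, and the per-pair terms stand in for whichever term of Eq. (18) is being averaged.

```python
def weighted_pair_average(values, weights):
    """Weighted mean of per-pair quantities and the effective pair number.

    values  : per-pair terms, e.g. squared dot products of orientation vectors
    weights : per-pair products of the two Scd class probabilities
    """
    total_weight = sum(weights)
    if total_weight <= 0.0:
        raise ValueError("sum of weights must be positive")
    mean = sum(w * v for v, w in zip(values, weights)) / total_weight
    return mean, total_weight

values = [0.9, 0.2, 0.5]                       # made-up pair terms
weights = [0.9 * 0.8, 0.6 * 0.55, 0.95 * 0.7]  # made-up p_Scd * p'_Scd
mean, n_eff = weighted_pair_average(values, weights)
print(mean, n_eff)
```

With hard classifications all weights equal one and the expression reduces to the plain average over N pairs; with probabilistic classifications the effective pair number drops below N, which enlarges the error estimate.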

Third, we incorporate errors in spectroscopic redshift by
drawing 1,000 Monte-Carlo realisations from the redshift's
error distribution. The resulting conditional estimate, now
out to 10 Mpc/h, is shown in panel (c) of Fig. 6. Qualitatively,
the impact of redshift errors on the correlation estimate of
angular-momentum-orientation vectors is not as severe as in the
case of handedness (cf. marginal estimate of Fig. 4). Note that
the binsize in Fig. 6 is much larger than in Fig. 4 because here
we are studying a smaller sample with fewer galaxy pairs.
Nonetheless, the estimated errors have indeed increased, which
is particularly obvious for the first distance bin. As the
binning is logarithmic in distance, this is not surprising
because the first distance bin has the smallest binsize and is
thereby most strongly affected by redshift errors "smearing out"
galaxy pairs along the horizontal axis.^13

Finally, we also take into account errors in ellipticity 
estimates. As mentioned in Sect. 5.4, the SDSS database 
actually does not provide error estimates for the isophotal 
ellipticities. Hence, we need to proceed using the rough er- 
ror estimates of Eqs. (24) and (25) as well as the uniform 
error in intrinsic axis ratios. This enables us to estimate a 
marginal autocorrelation function which is shown in panel 
(d) of Fig. 6. In comparison to panel (c), there is only a mi- 
nor increase in the errorbars. However, we would not put too 
much faith into the marginal estimate because the error esti- 
mate of ellipticities is rather coarse. Nevertheless, comparing 
to panel (a), the marginal estimate differs substantially from 
a conditional estimate and there are no statistically signifi- 
cant autocorrelations. 



5.10 Constraining theoretical parameters

The autocorrelation of angular-momentum orientations can
be used to estimate free parameters in the tidal-torque theory
(e.g. Lee & Pen 2008). Let $\xi(r,R)$ denote the two-point
correlation function of the density field, smoothed over scale
$R$. In this case, one can derive a model prediction for the
linear regime (e.g. Pen et al. 2000)

$$\xi_{LL}(r) \approx \frac{a^2}{6}\,\frac{\xi^2(r,R)}{\xi^2(r,0)} , \qquad (34)$$

where $a$ is a free model parameter. For the nonlinear regime,
Lee & Pen (2008) derived the following model prediction

$$\xi_{LL}(r) \approx \frac{a_l^2}{6}\,\frac{\xi^2(r,R)}{\xi^2(r,0)} + \varepsilon_{nl}\,\frac{\xi(r,R)}{\xi(r,0)} , \qquad (35)$$

where $a_l$ and $\varepsilon_{nl}$ are free model parameters describing
the linear and nonlinear contributions. Estimating values for
these model parameters is important in order to constrain
the tidal-torque theory. The impact of the additional error
sources on this parameter estimation is devastating. First,
the marginal estimate of $\xi_{LL}(r)$ has large errors. Second,
errors in redshift estimates and morphological classification^15
also affect the estimation of the two-point correlation
function $\xi(r,R)$. Given these considerations and the SDSS
sample, we have to conclude that it is currently impossible to
place decisive constraints on the theoretical parameters.

^13 We do not expect distance errors of the order of 0.2 Mpc/h to
have a large impact on a distance bin of 1 Mpc/h binsize.

Figure 7. Marginal likelihoods of fitting the marginal
angular-momentum-orientation autocorrelation. The model given by
Eq. (36) is fitted to the binned version of the marginal
autocorrelation of Fig. 6d. Top panel: Marginal likelihood of
amplitude with maximum at $A = 0.0034^{+\ldots}_{-\ldots}$. Centre
panel: Marginal likelihood of correlation length with maximum at
$R = 2.5^{+\ldots}_{-\ldots}$. Bottom panel: Marginal likelihood of
exponent with maximum at $C = 0.71^{+\ldots}_{-\ldots}$. The asymmetric
errors denote 68% confidence intervals. The parameter estimation
has been conducted on a three-dimensional brute-force grid. As
the distributions of Monte-Carlo realisations in every distance
bin are Gaussian in excellent approximation, the fit is done via
$\chi^2$-minimisation.

The same argument applies to the generic autocorrelation
model proposed by Schäfer & Merkel (2011),

$$\xi_{LL}(r) = A\,\exp\left[-\left(\frac{r}{R}\right)^{C}\right] , \qquad (36)$$

which contains a linear amplitude $A$ and two nonlinear
model parameters $R$ and $C$ that cannot be constrained properly.
Figure 7 demonstrates this by showing the marginal
likelihoods of fitting Eq. (36) to the binned data of panel
(d) of Fig. 6.^16 Evidently, the (marginal) uncertainties in
all model parameters are extremely large. Nevertheless, let
us note that the correlation length of 1 Mpc/h predicted by
Schäfer & Merkel (2011) is in agreement with our estimate.
Furthermore, for later purposes, we identify the best fitting
model,^17

$$\xi_{LL}(r) \approx 0.026 \cdot \exp\left[-\left(\frac{r}{0.34\,{\rm Mpc}/h}\right)^{\ldots}\right] . \qquad (37)$$

We explicitly emphasise that we do not claim that this is
by any means a model of the true correlation function. This
fit is solely meant to provide us with some model that is
compatible with the data. Such a model is later required in
order to conduct simulations. This is also the reason why we
do not need to estimate errors for the fit given by Eq. (37).

^14 In fact, this is the reason why Lee (2011) restricts the sample to
galaxies with $z \leq 0.02$ in order to obtain a volume-limited sample.
Otherwise, the density field of galaxies cannot be meaningfully
defined and $\xi(r,R)$ cannot be estimated.

^15 As $\xi_{LL}(r)$ has been estimated for Scd galaxies, also $\xi(r,R)$
has to be estimated for this type of galaxies.

Figure 6. Impact of various errors on estimates of autocorrelation function of angular-momentum-orientation vectors. Panel (a): Hard
estimator, neglecting classification uncertainties and errors in redshift and ellipticity estimates, taking into account only number statistics.
Panel (b): Soft estimator, accounting for classification uncertainties and number statistics, neglecting errors in redshift and ellipticity
estimates. Panel (c): Conditional estimate accounting for redshift errors, classification uncertainties and number statistics, neglecting
errors in ellipticity estimates. Panel (d): Marginal estimate taking into account classification uncertainties, number statistics, and errors
in redshift and ellipticity estimates. The solid line represents the fit given by Eq. (37).



6 BIASED ELLIPTICITY ESTIMATES FROM 
SECOND MOMENTS 

Isophotal ellipticity estimates have the disadvantage that
they strongly depend on the choice of a particular isophote
and therefore may suffer strongly from pixel noise. Ellipticity
estimates based on the moments of the galaxy's light
distribution at first glance seem to be more promising, since
no isophote is required and the complete data enters the
estimate. Consequently, we would expect that ellipticity
estimates based on light moments are more robust against
pixel noise than isophotal ellipticities, which might improve
autocorrelation estimates of angular-momentum-orientation
vectors. However, in this section, we demonstrate that
ellipticity estimates based on second moments of the light
distribution are so strongly biased that they cannot be used
for investigations of disc alignment. In particular, this bias
would cause us to overestimate the correlation due to alignment
such that, e.g., we would overestimate its impact on
weak-lensing studies.

^16 Actually, we should estimate the correlation function from
unbinned data like in Sect. 5.8. However, a meaningful likelihood
function is not easily defined in this case such that we have to
resort to fitting binned data. We are fully aware that binning may
compromise our assessment of statistical significance.

^17 Note that the maximum of the joint likelihood does not coincide
with the maxima of the marginalised likelihoods in Fig. 7.



6.1 Revealing the bias 

We also assess the usage of ellipticity estimates based on
unweighted second moments of the galaxies' light distributions.
Furthermore, SDSS offers error estimates for these parameters.
Figure 8 shows the result. The most striking difference to
Fig. 6d is that Fig. 8 exhibits correlations that are
substantially larger. This difference stems from systematic
differences in the axis ratios resulting from second moments
and isophotal contours, which is shown in Fig. 9. Evidently,
axis ratios estimated from second moments are systematically
larger than isophotal axis ratios while orientation angles are
unbiased. This implies that in Fig. 8 galaxies are
generally considered to be rounder than they actually are,
i.e., the inclination angle is misestimated. Given the
formalism of Lee (2011), this bias bends the estimated
angular-momentum-orientation vectors into the line-of-sight,
thereby feigning these strong correlations. Our scepticism is
further raised by the enormous statistical significance of the
correlations, which still seems to hold at separations as large
as 10 Mpc/h. Finally, we note that the background correlation
estimated from randomly shuffling the galaxy positions in
the sample (cf. Lee 2011) is not zero. This suggests the
presence of a strong bias, corrupting the correlation estimate
of Fig. 8.

Figure 8. Pseudo-marginal estimate of the angular-momentum-
orientation autocorrelation for the sample of Scd galaxies taking
into account uncertainties in classification, number statistics,
errors in redshift and ellipticity estimates. Results have been
averaged over 1,000 Monte-Carlo samples drawn from the error
distribution of spectroscopic redshifts. The dots indicate mean
values and the errorbars correspond to one Gaussian standard
deviation. The horizontal dashed line indicates the background
correlation level estimated from 100 random shufflings of the
galaxy positions, a method used by Lee (2011).

Figure 9. Comparing ellipticities based on isophotes and
unweighted second moments. Top panels: Axis ratios (left) and
orientation angles (right) for Scd galaxies. Bottom panels show
the same for Sab galaxies. Axis ratios estimated from second
moments are systematically larger than those estimated from
isophotal contours, i.e., second moments find the disc galaxies
to be rounder. Orientation angles are unbiased. The distributions
of axis ratios for Scd and Sab galaxies agree with the results of
Huertas-Company et al. (2011) (their Fig. 2).



6.2 Point-spread function 

Is this bias an effect of the point-spread function (PSF) 
which makes galaxies look rounder than they actually are? 
This is unlikely because all our objects are large compared to 
the size of the PSF. The median r-band Petrosian radius of 



Figure 10. Impact of a circular Gaussian PSF with Petrosian radius
of 1.3 pixel on the convolved axis ratios $q_{\rm con}$ and orientation
angles $\theta_{\rm con}$ of exponential-disc profiles with Petrosian
radii of 15.8 pixels, intrinsic axis ratios $0.1 \leq q_{\rm int} \leq 1$
and orientation angles $\theta_{\rm int} = 30^\circ$. All profiles have
been truncated at five scale radii. There was no noise in this
simulation. The PSF leads to an overestimation of the axis ratios by
at most 1.2% for highly elongated objects. As the PSF was circular in
this test, orientation angles are not affected.



the 4,211 Scd galaxies with SDSS data is 15.8 pixel, whereas
the r-band Petrosian radius of the SDSS PSF is approximately
1.3 pixel. Consequently, the impact of the PSF
should be small. This expectation is supported by Fig. 10,
where we simulate the impact of a Gaussian PSF with Petrosian
radius 1.3 pixel on exponential-disc profiles with
Petrosian radii of 15.8 pixel and different intrinsic axis ratios.
We find a maximum overestimation of axis ratios of
only 1.2%, which is not enough to explain the strong bias in
Fig. 8 or the discrepancy in Fig. 9.
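
The size of this PSF effect can be checked with a few lines of code. The following is only a rough sketch, not the simulation behind Fig. 10: it uses the fact that unweighted second central moments add under convolution, so a circular Gaussian PSF of width `psf_sigma` adds `psf_sigma**2` to both diagonal moments. The scale length of 8 pixels and the PSF width of 0.6 pixels are illustrative assumptions and are not calibrated to the Petrosian radii quoted above.

```python
import numpy as np

def exponential_disc(size, scale, q, theta_deg):
    """Elliptical exponential profile, truncated at five scale radii."""
    y, x = np.indices((size, size), dtype=float)
    dx, dy = x - size // 2, y - size // 2
    t = np.radians(theta_deg)
    major = dx * np.cos(t) + dy * np.sin(t)    # coordinate along major axis
    minor = -dx * np.sin(t) + dy * np.cos(t)   # coordinate along minor axis
    r = np.hypot(major, minor / q)             # elliptical radius
    return np.where(r <= 5.0 * scale, np.exp(-r / scale), 0.0)

def second_moments(image):
    """Unweighted second central moments (Qxx, Qyy, Qxy)."""
    y, x = np.indices(image.shape, dtype=float)
    flux = image.sum()
    dx = x - (image * x).sum() / flux
    dy = y - (image * y).sum() / flux
    return ((image * dx**2).sum() / flux,
            (image * dy**2).sum() / flux,
            (image * dx * dy).sum() / flux)

def axis_ratio_and_angle(qxx, qyy, qxy):
    """Axis ratio b/a and orientation angle in degrees."""
    mean = 0.5 * (qxx + qyy)
    root = np.hypot(0.5 * (qxx - qyy), qxy)
    angle = 0.5 * np.degrees(np.arctan2(2.0 * qxy, qxx - qyy))
    return np.sqrt((mean - root) / (mean + root)), angle

def convolved_shape(image, psf_sigma):
    """Shape after convolution with a circular Gaussian of width psf_sigma."""
    qxx, qyy, qxy = second_moments(image)
    # Convolution adds the PSF variance to the diagonal moments only.
    return axis_ratio_and_angle(qxx + psf_sigma**2, qyy + psf_sigma**2, qxy)

# Illustrative (assumed) numbers: disc scale 8 px, PSF sigma 0.6 px.
disc = exponential_disc(size=201, scale=8.0, q=0.3, theta_deg=30.0)
q_int, theta_int = axis_ratio_and_angle(*second_moments(disc))
q_con, theta_con = convolved_shape(disc, psf_sigma=0.6)
print(f"q_int={q_int:.4f}  q_con={q_con:.4f}  theta_con={theta_con:.2f} deg")
```

In this sketch the axis ratio grows only slightly and the orientation angle is untouched, because a circular PSF changes neither the off-diagonal moment nor the difference of the diagonal moments.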



6.3 The origin of the bias: Galactic bulges 

We are now going to argue that the heavily biased correla- 
tion estimate of Fig. 8 stems from the galactic bulges biasing 
the second moments and thereby the ellipticity estimates. 
At first glance, this may seem to be a rather unlikely ex- 
planation, since we explicitly selected only Scd galaxies in 
order to minimise the impact of galactic bulges. However, 
this hypothesis can explain the substantial discrepancy be- 
tween isophotal axis ratios and axis ratios based on second 
moments revealed by Fig. 9. If bulges were an issue, they
would affect the second moments and would lead us to overestimate
axis ratios, since bulges are in any case "roundish".
On the other hand, isophotal ellipticity estimates should be
unaffected by the presence of bulges as long as the isophote
used is in the disc component. In fact, Bernstein (2010) discusses
this issue in the context of shear measurements in weak
lensing. We demonstrate that the presence of a bulge can
bias the estimate of axis ratio based on second moments.
For this purpose, we perform a bulge-disc decomposition of 



The r-band Petrosian radius of the SDSS PSF has been estimated
as the median r-band Petrosian radius of 100,000 stars
downloaded from the SDSS database.

Figure 11. Bulge-disc decomposition of an example Scd galaxy
(g-band). The bulge is a circular de Vaucouleurs profile, while the
disc component is an exponential profile with ellipticity. The bulge
is pinned to the pixel of the peak-of-light whereas the centroid of
the disc component is free. Panel (a) shows the original galaxy.
Panel (b) is the disc component, while panel (c) is the bulge
component. Panel (d) displays the fit residuals. The fit was performed
by $\chi^2$-minimisation using a Simplex algorithm (Nelder & Mead
1965) and reached a minimum value of 3.18 per pixel.

a prototypical Scd galaxy from our data sample, which is
shown in Fig. 11. Indeed, the axis ratio estimated from the
second moments of the complete model (including bulge) is
$q_{b+d} \approx 0.48$, whereas the axis ratio used by the disc model
is only $q_{\rm disc} \approx 0.38$. We conclude that the bulge is well
capable of biasing the ellipticity estimate substantially, even
in the case of Scd galaxies.
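
The mechanism can be illustrated with a toy calculation. For two concentric components, the unweighted second-moment tensor of the sum is the flux-weighted mean of the component tensors, so a circular bulge adds the same amount to the major-axis and minor-axis moments and pushes the axis ratio towards one. The sketch below assumes this idealised setup; the function name and all numerical values (moments and bulge fraction) are hypothetical and are not taken from the fit of Fig. 11.

```python
import numpy as np

def composite_axis_ratio(q_disc, disc_var_major, bulge_var, bulge_fraction):
    """Second-moment axis ratio of a concentric disc plus circular bulge.

    q_disc         : intrinsic axis ratio of the disc component
    disc_var_major : second moment of the disc along its major axis
    bulge_var      : second moment of the circular bulge along any axis
    bulge_fraction : bulge flux divided by total flux (0 <= f < 1)
    """
    f = bulge_fraction
    # Flux-weighted mean of the component moments along each principal axis.
    major = (1.0 - f) * disc_var_major + f * bulge_var
    minor = (1.0 - f) * q_disc**2 * disc_var_major + f * bulge_var
    return np.sqrt(minor / major)

# Hypothetical numbers, chosen only to show the direction of the effect.
q_no_bulge = composite_axis_ratio(0.38, 100.0, 25.0, 0.0)
q_with_bulge = composite_axis_ratio(0.38, 100.0, 25.0, 0.3)
print(q_no_bulge, q_with_bulge)
```

Without a bulge the function returns the disc axis ratio itself; any positive bulge fraction yields a larger value, which is the direction of the bias seen in Fig. 9.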

As another test for our hypothesis to pass, we compare
the axis ratios based on isophotes and second moments for
Sab galaxies from the sample of Huertas-Company et al.
(2011). As Sab galaxies have more prominent bulges than
Scd galaxies, we would expect a stronger bias than in Fig. 9.
We select all galaxies with $p_{\rm Sab} \geq 0.8$ and download the
r-band Stokes parameters from the SDSS database, if available.
For the resulting 8,496 Sab galaxies, Fig. 9 also shows
the comparison of ellipticities estimated from isophotes and
second moments. Evidently, the second moments are biased,
too, and the bias is also more pronounced than for Scd galaxies.
This confirms our expectation.
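
For readers who want to repeat this comparison, the Stokes parameters have to be converted into an axis ratio and an orientation angle. The sketch below assumes the convention, as we understand it from the SDSS algorithm documentation, that $Q = \frac{a-b}{a+b}\cos 2\phi$ and $U = \frac{a-b}{a+b}\sin 2\phi$ for an elliptical light distribution; the function name is hypothetical and the convention should be checked against the data release in use.

```python
import numpy as np

def stokes_to_shape(stokes_q, stokes_u):
    """Axis ratio b/a and position angle (degrees) from Stokes Q and U.

    Assumes Q = e*cos(2*phi) and U = e*sin(2*phi) with e = (a-b)/(a+b).
    """
    e = np.hypot(stokes_q, stokes_u)               # (a - b) / (a + b)
    axis_ratio = (1.0 - e) / (1.0 + e)             # invert for b/a
    angle = 0.5 * np.degrees(np.arctan2(stokes_u, stokes_q))
    return axis_ratio, angle

# Round trip for an ellipse with b/a = 0.5 at 30 degrees.
e_in = (1.0 - 0.5) / (1.0 + 0.5)
q_out, phi_out = stokes_to_shape(e_in * np.cos(np.radians(60.0)),
                                 e_in * np.sin(np.radians(60.0)))
print(q_out, phi_out)
```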

From our hypothesis of bulges biasing second moments,
we can deduce the following prediction: If galactic bulges
indeed bias second moments such that estimated
angular-momentum-orientation vectors are bent into the line of sight,
the angular correlation function should exhibit a bias of the
form

    $b(\theta) = A + B\cos^2\theta$ ,    (38)

where $\theta$ now denotes the angular separation of two galaxies.
The reason is that, due to the bending of orientation
vectors, the scalar product $\vec{L} \cdot \vec{L}'$ is on average equal to the
cosine of the two galaxies' separation angle. This prediction
is confirmed by Fig. 12, which strongly suggests that $\xi_{LL}(\theta)$



The g-band axis ratio noted in the SDSS database for this
example galaxy is $q_{\rm iso} \approx 0.41$ estimated from isophotes and
$q_{\rm mom} \approx 0.63$ estimated from second moments (Stokes
parameters). The discrepancy of axis ratios from the SDSS database and
the bulge-disc decomposition is the consequence of a non-optimal
model.

The parameter values A and B depend on the details of the
bias caused by the galactic bulges and are not generally
predictable.



Figure 12. Comparing autocorrelations of angular-momentum-orientation
vectors in angular separation for ellipticity estimates
based on second moments (top) and isophotes (bottom). The bias
model of Eq. (38) with $1\sigma$ errors is shown in the top panel.



is dominated by this bias. This suspect behaviour is also
exhibited by the autocorrelation function in real space, as
shown in the top panel of Fig. 13. Figure 12 also shows that
when using isophotal ellipticity estimates, $\xi_{LL}(\theta)$ does not
exhibit such a bias.
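
A bias model that is linear in its two parameters can be fitted by ordinary weighted least squares. The following sketch assumes that the bias has the form $A + B\cos^2\theta$ with $\theta$ in radians; the function names and the synthetic test values are hypothetical, and this is not necessarily the fitting procedure used for Fig. 12.

```python
import numpy as np

def fit_bias_model(theta, xi, sigma):
    """Weighted linear least-squares fit of xi = A + B*cos(theta)**2."""
    design = np.column_stack([np.ones_like(theta), np.cos(theta) ** 2])
    w = 1.0 / sigma                                # weight rows by 1/sigma
    coeffs, *_ = np.linalg.lstsq(design * w[:, None], xi * w, rcond=None)
    return coeffs                                  # array([A, B])

def debias(pair_projection, theta, coeffs):
    """Subtract the fitted bias from pairwise projections at separation theta."""
    return pair_projection - (coeffs[0] + coeffs[1] * np.cos(theta) ** 2)

# Synthetic, noise-free check with made-up parameter values.
theta = np.linspace(0.01, 0.5, 25)
xi = -0.30 + 0.90 * np.cos(theta) ** 2
A_fit, B_fit = fit_bias_model(theta, xi, sigma=np.full(theta.size, 0.01))
residual = debias(xi, theta, (A_fit, B_fit))
print(A_fit, B_fit)
```

Note that for small angular separations $\cos^2\theta$ is close to one, so A and B become strongly degenerate and only their sum is well constrained.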

Is it possible to debias the autocorrelation function by
subtracting Eq. (38) from all pairwise projections of
angular-momentum-orientation vectors? We investigate this question
in Fig. 13, where we show the biased and debiased autocorrelation
function. Indeed, the debiased autocorrelation function
looks very promising. For later modelling purposes, we
parametrise the debiased autocorrelation function by



    $\xi_{LL}(r) \approx (0.013 + 0.002\,r - 0.00036\,r^2)\,\exp\left[-\frac{r}{6.1\,{\rm Mpc}/h}\right]$ ,    (39)

where no error estimate is required because we only use this 
fit as input in simulations. Is the debiased autocorrelation 
function trustworthy? For comparison, Fig. 13 also shows the
unbiased autocorrelation function based on isophotal ellipticities.
Evidently, the debiased and isophotal autocorrelation
functions do not agree. However, this does not necessarily
rule out the debiased autocorrelation function because we
actually expect that ellipticity estimates based on second
moments are less noisy than isophotal ellipticity estimates
since they use the whole light distribution instead of a single
isophote. Hence, it is not a priori implausible that the debiased
autocorrelation function exhibits more information
than the isophotal autocorrelation function.

In order to assess the trustworthiness of the debi- 
ased autocorrelation estimate, we conduct the following self- 
consistency test: We take the original galaxies as in Fig. 13, 
maintaining their true spatial positions, but when estimating 



Note that the angular correlation estimate in Fig. 12 looks
worse than the spatial correlation estimate of Fig. 6d. This is
due to the fact that the angular correlation function does not use
distance information.

Figure 13. Debiasing the autocorrelation function of
angular-momentum-orientation vectors. Top panel: The biased
autocorrelation function based on ellipticity estimates from second
moments. Middle panel: "Debiased" correlation function where
Eq. (38) has been subtracted from all pairwise projections. The
solid orange line is the fit given by Eq. (39). Bottom panel:
Autocorrelation function based on isophotal ellipticities.

the autocorrelation function, we replace the actual
angular-momentum-orientation vectors by simulated vectors which
exhibit the correlation function given by Eq. (39). This
simulation is described in Appendix A2. Panel (a) of Fig. 14
validates our simulation method. We then simulate the bias
of second moments. For every galaxy, we take the simulated
angular-momentum-orientation vector and infer the actual
axis ratio $q_{\rm true}$ from it. Motivated by the top left panel of
Fig. 9, we then replace the true axis ratio by an "overestimate"
drawn from the uniform distribution over the interval
$[q_{\rm true}, 1]$. Using this biased axis ratio, we recompute
the angular-momentum-orientation vector and estimate the
correlations. As shown in panel (b) of Fig. 14, the resulting
biased autocorrelation function closely resembles the observation
from Fig. 13. For debiasing, we then also estimate
the autocorrelation in angular space, as shown in panel (c)
of Fig. 14. Indeed, the estimate is dominated by a bias of
the form of Eq. (38), i.e., our bias simulation is realistic. We
then estimate the debiased autocorrelation function, which
is shown in panel (d). Evidently, the debiased result exhibits
systematic and significant deviations from the input
autocorrelation function. We emphasise that the debiased result
is not an obscured version of the input correlation function.
Neither their difference nor their ratio is a constant, i.e., the
debiasing was not successful. Consequently, the debiasing is
not self-consistent and the debiased autocorrelation estimate
shown in Fig. 13 is not trustworthy.
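
The bias step of this self-consistency test can be sketched as follows. The sketch makes a simplifying assumption that is not stated in the text above: the disc is taken to be infinitely thin, so that the axis ratio equals the absolute line-of-sight component of the unit orientation vector. The function name is hypothetical, and the formalism of Lee (2011) may include a finite disc thickness that is ignored here.

```python
import numpy as np

def biased_orientation(L, e_r, rng):
    """Replace the true axis ratio by an overestimate and rebuild the vector.

    Assumes an infinitely thin disc, so that the axis ratio equals the
    absolute line-of-sight component of the unit orientation vector L.
    """
    radial = np.dot(L, e_r)
    q_true = abs(radial)
    q_biased = rng.uniform(q_true, 1.0)        # overestimate in [q_true, 1]
    tangential = L - radial * e_r              # component in the sky plane
    norm = np.linalg.norm(tangential)
    if norm == 0.0:                            # exactly face-on: nothing to bend
        return L.copy(), q_true, q_biased
    sign = 1.0 if radial >= 0.0 else -1.0
    new = (sign * q_biased * e_r
           + np.sqrt(1.0 - q_biased**2) * tangential / norm)
    return new, q_true, q_biased

rng = np.random.default_rng(1)
e_r = np.array([0.0, 0.0, 1.0])                # line of sight
L_true = np.array([0.8, 0.0, 0.6])             # unit vector with q_true = 0.6
L_biased, q_true, q_biased = biased_orientation(L_true, e_r, rng)
print(q_true, q_biased, L_biased)
```

The rebuilt vector keeps its direction in the plane of the sky but has a larger line-of-sight component, which is the "bending into the line of sight" described in Sect. 6.3.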



Figure 14. Self-consistency test of debiasing the autocorrelation 
function. Panel (a): The input autocorrelation function as given 
by Eq. (39), validating our simulation technique. Panel (b): The 
biased autocorrelation function. Panel (c): The debiasing of the 
autocorrelation function in angular space. Panel (d): The debi- 
ased autocorrelation function, which exhibits significant devia- 
tions from the input. 

7 IMPROVEMENTS AND POTENTIAL OF 
FUTURE SURVEYS 

We showed in Figs. 4d and 6d that with current data 
there are no statistically significant autocorrelations. What 
can be done to improve these results? In this section, 
we briefly elaborate on improvements of ellipticity esti- 
mates and the potential of future sky surveys, namely 
PanSTARRS, LSST and EUCLID, to enhance the estimates 
of handedness and angular-momentum-orientation autocor- 
relations. We discuss the impact of number statistics and 
improvements of redshift estimates. We also discuss mor- 
phological classification and estimation of front-edges of disc 
galaxies. 

7.1 Improving ellipticity estimates

We demonstrated in Sect. 6 that ellipticity estimates based 
on second moments are strongly biased by galactic bulges 
even for Scd galaxies. In fact, Fig. 12 suggests that corre- 
lation estimates based on second moments are completely 
dominated by this bias which overwrites the desired astro- 
physical signal. Therefore, we conclude that ellipticity esti- 
mates based on second moments overestimate axis ratios and 
thereby corrupt estimates of angular-momentum-orientation
autocorrelation. This bias also corrupts similar correlation
estimates, such as ellipticity autocorrelations (e.g. Blazek 
et al. 2011), leading us to overestimate the impact of disc 
alignment on weak-lensing studies. What are alternative el- 
lipticity estimators? This same bias also applies to adaptive 
moments (Bernstein & Jarvis 2002; Hirata & Seljak 2003) in 
this context. Furthermore, model-based ellipticity estimates 
are problematic, since nearby disc galaxies usually exhibit 
rich azimuthal structures, which are virtually impossible to 
model faithfully. The only kind of model designed to describe
such rich azimuthal structure is a basis-function expansion
(e.g. Massey & Refregier 2005; Ngan et al. 2009), which
unfortunately suffers from other severe conceptual problems
(Melchior et al. 2010; Andrae et al. 2011). We have to conclude
that isophotal ellipticities, though relying on a somewhat
arbitrarily chosen isophote, are the only useful ellipticity
estimates for investigations of angular-momentum-orientation
autocorrelation, since they are closest to the disc
ellipticity.

There is yet another serious conceptual issue we have to 
face. In the weak-lensing context galaxies are usually rather 
small with radii of a few pixels only. In our case, however, 
we are considering large extended disc galaxies. Disc galax- 
ies usually exhibit substructures such as galactic bars, rings 
or star-forming regions. In particular, the Scd galaxies con- 
sidered by Lee (2011) and in this work typically exhibit very 
open spiral-arm patterns. For such objects there are consid- 
erable ellipticity gradients (Bernstein 2010) and "disc ellip- 
ticity" is not a well defined concept anymore. Therefore, it 
may be helpful to estimate ellipticities in the near infrared 
regime, where, e.g., star-forming regions are not as promi- 
nent as in the optical regime such that disc galaxies look 
smoother.

Figure 15. Impact of number statistics on the errors of the
innermost three distance bins in the marginal handedness
autocorrelation function. The x-axis shows the fraction of galaxy pairs
selected from all pairs, which is equivalent to a survey covering
the same fraction of the total survey area. Both axes are in
logarithmic scale, i.e., the dependence of the errors is approximately
a power law for all three bins. The dashed line indicates a power
law of $N^{-1/2}$, where N is the number of pairs in every bin.



7.2 Improving number statistics

An obvious strategy to improve estimates of handedness
or angular-momentum-orientation autocorrelations is to increase
the number of galaxies in the data sample. For instance,
SDSS and thereby Galaxy Zoo cover approximately
one quarter of the full sky. How would an extension to an
(extragalactic) all-sky survey improve the autocorrelation
estimates? If we assume identical depth, this areal extension
leaves the galaxy density unchanged; it only increases
the number of galaxy pairs in all distance bins.

In order to study the improvement of an enlarged survey
area, we randomly draw subsamples from the Galaxy
Zoo database (a larger database is not available, so we
use smaller databases) and estimate their handedness
autocorrelations. In fact, we do not draw the subsamples from
the database itself, which would correspond to reducing the
galaxy density. Instead, we randomly draw the subsamples
from the list of galaxy pairs. Figure 15 clearly shows that the
errors in the handedness autocorrelation function are indeed
dominated by number statistics, since the errors depend on
sample size with a power law of exponent $-\frac{1}{2}$. Consequently,
an extension from SDSS to full-sky coverage with SDSS quality
would increase the database approximately threefold (the
Milky Way obscures roughly one quarter of the sky) and
thereby would decrease the errors by a factor of $\sqrt{3} \approx 1.7$.
Given the results of Figs. 4d and 6d, this would clearly be
a major breakthrough in the measurability of potential
autocorrelations.



The SDSS pipeline uses the 25 magnitudes per square arcsec- 
ond isophote. 

http://www.sdss.org/dr6/algorithms/classify.html#photo_stokes
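
The inverse-square-root dependence of the errors on the number of pairs discussed in Sect. 7.2 can be reproduced with a toy experiment. The sketch below is not the subsampling of the Galaxy Zoo pair list: it assumes uncorrelated pairs whose handedness product is +1 or -1 with equal probability, and it draws fresh synthetic pairs for every realisation. The pool size and number of realisations are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n_pairs_full = 10000                       # assumed size of the full pair list
fractions = np.array([0.01, 0.03, 0.1, 0.3, 1.0])
n_realisations = 300

errors = []
for f in fractions:
    n = int(round(f * n_pairs_full))
    # Each pair contributes +1 (same handedness) or -1 (opposite handedness).
    products = rng.choice([-1.0, 1.0], size=(n_realisations, n))
    # Scatter of the mean product over independent realisations.
    errors.append(products.mean(axis=1).std(ddof=1))
errors = np.array(errors)

# Slope of the error versus pair fraction in log-log space.
slope, intercept = np.polyfit(np.log10(fractions), np.log10(errors), 1)
gain_full_sky = np.sqrt(3.0)               # threefold more pairs
print(f"slope = {slope:.3f}, full-sky error reduction = {gain_full_sky:.2f}")
```

The fitted slope scatters around -1/2, and a threefold increase in the number of pairs corresponds to the error reduction by a factor of about 1.7 quoted in the text.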



7.3 Improving redshift estimates 

Reducing the errors in spectroscopic redshift estimates
would clearly help in order to reduce the errors in the
autocorrelation functions. For instance, the redshift error of
$\sigma_z = 7.8 \cdot 10^{-5}$ at $z = 6.5993 \cdot 10^{-2}$ quoted in Fig. 2
corresponds to an error in the radial-velocity estimate of
$c\,\sigma_z/(1+z)^2 \approx 20.6$ km/s. However, given the typical velocity
dispersion of galaxies in small groups of (202 ± 10) km/s and
in large clusters of (854 ± 102) km/s (Becker et al. 2007), the
spectroscopic redshift estimates of SDSS are already picking
up peculiar motions of individual galaxies instead of cosmological
expansion. Consequently, further improving the accuracy
of spectroscopic redshifts cannot improve estimates
of, e.g., the handedness autocorrelation function.
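
The velocity error quoted above can be verified with a short calculation. The sketch assumes a redshift error of $7.8 \cdot 10^{-5}$ at a redshift of $6.5993 \cdot 10^{-2}$ and a conversion factor of $c/(1+z)^2$; these exponents and the power of $(1+z)$ are inferred from the requirement that the result matches 20.6 km/s.

```python
C_KM_S = 299792.458          # speed of light in km/s

sigma_z = 7.8e-5             # assumed spectroscopic redshift error
z = 6.5993e-2                # assumed redshift of the example galaxy

# Error of the radial-velocity estimate implied by the redshift error.
sigma_v = C_KM_S * sigma_z / (1.0 + z) ** 2

# Compare with the velocity dispersions from Becker et al. (2007).
ratio_groups = sigma_v / 202.0
ratio_clusters = sigma_v / 854.0
print(f"sigma_v = {sigma_v:.1f} km/s "
      f"({ratio_groups:.2f} of group, {ratio_clusters:.3f} of cluster dispersion)")
```

The measurement error is roughly a tenth of the velocity dispersion in small groups, which is why peculiar motions rather than redshift errors limit the distance estimates.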

Given the impact of uncertainties in spectroscopic redshift
estimates on, e.g., the handedness autocorrelation function,
it is obvious that photometric redshift estimates cannot
help to improve the situation. Typically, uncertainties
in photometric redshift estimates are two orders of magnitude
larger than uncertainties in spectroscopic redshift estimates.
Considering Fig. 2, this would lead to an error in
the comoving distance of several tens of Mpc/h. Moreover,
though there are many more galaxies with photometric red- 
shift estimates than galaxies with spectroscopic redshift es- 
timates (typically at least one order of magnitude), these ad- 
ditional objects are typically also much fainter because selec- 
tion for spectroscopic observations is usually triggered by the 






galaxy's brightness. The faintness of these additional objects 
would therefore also complicate the morphological classifica- 
tion. For a disc galaxy, the fainter the object, the more diffi- 
cult it is to identify the disc. Consequently, surveys that offer 
only photometric but no spectroscopic redshift estimates are 
of no use to estimate these autocorrelation functions. This 
essentially rules out PanSTARRS and LSST. Conversely, 
the EUCLID survey will gather of the order of 100 million 
spectroscopic redshifts of galaxies. Unfortunately, the galaxy 
sample observed by EUCLID will have redshifts between 0.5 
and 2. As was shown by Crittenden et al. (2001), estimates
of handedness and angular-momentum-orientation correlations
are compromised by weak-lensing signals for z > 0.3.
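
The statement that photometric redshifts imply distance errors of several tens of Mpc/h can be checked to order of magnitude. The sketch below assumes the low-redshift Hubble law, $d \approx cz/H_0$ with $H_0 = 100\,h$ km/s/Mpc, a spectroscopic error of $7.8 \cdot 10^{-5}$, and a photometric error exactly one hundred times larger; it ignores the full cosmological distance-redshift relation used for Fig. 2.

```python
C_KM_S = 299792.458              # speed of light in km/s
H0 = 100.0                       # km/s/Mpc, so distances are in Mpc/h

sigma_z_spec = 7.8e-5            # assumed spectroscopic redshift error
sigma_z_phot = 100.0 * sigma_z_spec   # two orders of magnitude larger

# Low-redshift approximation: distance error = c * sigma_z / H0.
sigma_d_spec = C_KM_S * sigma_z_spec / H0
sigma_d_phot = C_KM_S * sigma_z_phot / H0
print(f"spectroscopic: {sigma_d_spec:.2f} Mpc/h, "
      f"photometric: {sigma_d_phot:.1f} Mpc/h")
```

A photometric distance error of this size exceeds the expected correlation length of about 1 Mpc/h by more than an order of magnitude.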



7.4 Morphological classification in future surveys 

Evidently, autocorrelation estimates of handedness and 
angular-momentum orientation require morphological clas- 
sification in future surveys. As we cannot probe high-redshift 
galaxies for this purpose, the morphological classes used by 
Galaxy Zoo or Huertas-Company et al. (2011) are sufficient 
and no further diversification is necessary. In particular, this 
implies that we can build on these two morphological catalogues
to classify galaxies in future surveys: First, we identify
the galaxies of known morphological types in the new
survey. Second, we use the new survey's imaging or spectroscopic
data to estimate those galaxies' parameters. Finally,
using these parameters and the galaxies of known morpho- 
logical types as a training sample, we can set up a proba- 
bilistic classification algorithm to extend this classification 
scheme to the new survey catalogue. In fact, this is precisely 
the same exercise as Huertas-Company et al. (2011) did, but 
on much larger scale. In particular, the Galaxy Zoo sample 
with approximately 900,000 visually classified galaxies out to 
redshift z ~ 0.5 would provide an extremely valuable train- 
ing sample. Gauci et al. (2010) demonstrated that modern 
classification algorithms perform excellently in reproducing 
the visual classifications of the Galaxy Zoo sample. This 
strategy has several advantages: It is easy to carry out, it
does not require much computational time, and it is highly
accurate and objective.
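
The three steps above amount to training a probabilistic classifier on galaxies of known type and applying it to the remaining catalogue. As a minimal stand-in, the sketch below implements a Gaussian naive-Bayes classifier that returns class probabilities. This is not the algorithm of Huertas-Company et al. (2011) or Gauci et al. (2010); the class name, the two features and the synthetic training data are purely hypothetical.

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal probabilistic classifier with independent Gaussian features."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) + 1e-9
                              for c in self.classes_])
        self.log_prior_ = np.log(np.array([(y == c).mean()
                                           for c in self.classes_]))
        return self

    def predict_proba(self, X):
        # Log-likelihood of every object under every class, summed over features.
        diff = X[:, None, :] - self.mean_[None, :, :]
        log_like = -0.5 * (diff**2 / self.var_
                           + np.log(2.0 * np.pi * self.var_)).sum(axis=2)
        log_post = log_like + self.log_prior_
        log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
        post = np.exp(log_post)
        return post / post.sum(axis=1, keepdims=True)

# Synthetic "training sample": two classes in two made-up features.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, size=(500, 2)),
                     rng.normal(3.0, 1.0, size=(500, 2))])
y_train = np.repeat([0, 1], 500)
X_new = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
                   rng.normal(3.0, 1.0, size=(200, 2))])
y_new = np.repeat([0, 1], 200)

model = GaussianNaiveBayes().fit(X_train, y_train)
proba = model.predict_proba(X_new)
accuracy = np.mean(model.classes_[proba.argmax(axis=1)] == y_new)
print(f"accuracy on synthetic data: {accuracy:.3f}")
```

The class probabilities returned by `predict_proba` play the role of the classification probabilities (such as $p_{\rm Sab}$) used to select samples and to propagate classification uncertainty.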



7.5 Front-edge estimation 

With so little information in the data, using additional in- 
formation can be very helpful. Such additional information 
is provided by an estimate of the disc's front edge, i.e., which 
edge of the semi-minor axis is pointing towards us. If we can 
estimate the front-edge, we can use the results as weights $p_a$
and $p_b$ in the correlation estimator of Eq. (18). Evidently, if
we knew the front edge of every galaxy in our data sample, 
this would break the geometric degeneracy in the angular- 
momentum-orientation vector and thereby would improve 
the correlation estimate. 



7.5.1 Visual classification 

We estimate the front-edge by looking for dust extinction,
in particular dust lanes. We visually inspect g-band images,
since of all five SDSS bands this band is most strongly
affected by dust extinction while still being of decent depth.
The outcome of such a visual inspection is as follows:

• Equal weights $p_a = p_b = \frac{1}{2}$ if we are uncertain.

• Weight of 0.6 to indicate a somewhat uncertain trend.

• Weight of 0.9 if we believe the classification to be certain.

We do not assign a weight of 1 in the last case, since there 
is always some uncertainty. By construction, this method 
works best for edge-on discs, since face-on discs do not 
display dust lanes. Unfortunately, knowing the front-edge
would have a larger impact for nearly face-on discs than for
edge-on discs (see definitions in Lee 2011). We visually
inspected g-band images of the 500 largest galaxies, sorted by
their Petrosian radii. For smaller galaxies, the resolution is 
not good enough to identify dust lanes. Unfortunately, we 
find only very few decisive front-edge classifications, namely 
40 Scd galaxies with certain front-edge classifications and 
39 with somewhat uncertain front-edge. Consequently, we 
find no substantial improvement of the marginal correlation 
estimate. Nevertheless, future sky surveys may have an im- 
proved imaging quality, such that a visual front-edge classi- 
fication is possible for more objects. 
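
How such front-edge weights might enter the estimate for a single galaxy pair is sketched below. The exact form of Eq. (18) is not reproduced in this section, so the sketch assumes that each of the four possible projections is weighted by the product of the corresponding front-edge probabilities; for equal weights of 1/2 this reduces to the prefactor 1/4 of Eq. (A7). The function name and the example vectors are hypothetical.

```python
import numpy as np

def weighted_pair_term(La, Lb, Lpa, Lpb, p_a, p_a_prime):
    """Front-edge-weighted squared projection for one galaxy pair.

    La, Lb   : the two degenerate orientation vectors of the first galaxy
    Lpa, Lpb : the two degenerate orientation vectors of the second galaxy
    p_a      : probability that La is the correct solution (p_b = 1 - p_a)
    p_a_prime: the same for the second galaxy
    """
    p_b, p_b_prime = 1.0 - p_a, 1.0 - p_a_prime
    return (p_a * p_a_prime * np.dot(La, Lpa) ** 2
            + p_a * p_b_prime * np.dot(La, Lpb) ** 2
            + p_b * p_a_prime * np.dot(Lb, Lpa) ** 2
            + p_b * p_b_prime * np.dot(Lb, Lpb) ** 2)

# Example with made-up unit vectors; Lb and Lpb have flipped radial parts.
La = np.array([0.6, 0.0, 0.8])
Lb = np.array([0.6, 0.0, -0.8])
Lpa = np.array([0.0, 0.6, 0.8])
Lpb = np.array([0.0, 0.6, -0.8])

uncertain = weighted_pair_term(La, Lb, Lpa, Lpb, 0.5, 0.5)
confident = weighted_pair_term(La, Lb, Lpa, Lpb, 0.9, 0.9)
print(uncertain, confident)
```

With weights of 1/2 all four projections contribute equally, whereas weights of 0.9 let the preferred solution dominate the estimate.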

7.5.2 Automated classification 

It is definitely beneficial to obtain a front-edge classification 
for galaxies with intermediate inclinations, since the rounder 
the object the larger the information gain. Unfortunately, 
visual classification via dust lanes is restricted to highly in- 
clined discs. Therefore, the front edge needs to be inferred 
in a different way, which should ideally be fully automated 
in order to ensure objectiveness. One potential approach is 
front-edge classification via colour gradients from dust extinction.
However, this requires highly accurate photometric
positions. In simple tests, we found that coordinate
offsets between the different bands of as little as a tenth of a
pixel along the semi-minor axis can compromise such estimates.
Another approach is front-edge classification via dust
extinction in single-band photometry. In the case of SDSS,
this would ideally be the g-band, where the impact of dust
extinction is larger than in r, i, z whereas the g-band is not
as shallow as the u-band. This approach would compare the
fluxes above and below the major axis, thereby estimating
the front edge. In contrast to colour-based methods, this 
approach does not rely on accurate photometric positions. 
However, like any automated method for front-edge clas- 
sification, it suffers from several other effects such as star- 
forming regions in the galaxy or foreground stars which com- 
promise colour gradients and flux differences. These effects 
are the major obstacles which have to be overcome in order 
to set up a reliable front-edge classification algorithm. 
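
The single-band approach of comparing the fluxes on both sides of the major axis can be sketched in a few lines. The function below is a hypothetical illustration: it assumes that the galaxy centre and the position angle of the major axis are already known, and it does not mask foreground stars or star-forming regions, which the text identifies as the main obstacles. Which sign of the asymmetry corresponds to the front edge is deliberately left open.

```python
import numpy as np

def flux_asymmetry(image, centre, position_angle_deg):
    """Normalised flux difference between the two sides of the major axis."""
    y, x = np.indices(image.shape, dtype=float)
    t = np.radians(position_angle_deg)
    # Signed distance of every pixel from the major axis.
    offset = -(x - centre[0]) * np.sin(t) + (y - centre[1]) * np.cos(t)
    above = image[offset > 0].sum()
    below = image[offset < 0].sum()
    return (above - below) / (above + below)

# Synthetic test image: elliptical Gaussian with major axis along x.
y, x = np.indices((101, 101), dtype=float)
galaxy = np.exp(-0.5 * (((x - 50.0) / 20.0) ** 2 + ((y - 50.0) / 8.0) ** 2))
dusty = galaxy.copy()
dusty[y > 50.0] *= 0.8                     # mimic extinction on one side

asym_clean = flux_asymmetry(galaxy, (50.0, 50.0), 0.0)
asym_dusty = flux_asymmetry(dusty, (50.0, 50.0), 0.0)
print(asym_clean, asym_dusty)
```

A symmetric disc yields an asymmetry consistent with zero, whereas dimming one side by 20 per cent produces an asymmetry of about -0.11 in this toy example.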



8 DISCUSSION AND CONCLUSIONS 

We have shown that when all relevant error sources are taken 
into account, there are no statistically significant autocor- 
relations, neither of spiral-arm handedness nor of angular- 
momentum-orientation vectors of Scd galaxies. Previous es- 
timates (Slosar et al. 2009; Lee 2011) did not account for 
these error sources and therefore are conditional estimates
that underestimated the errors and overestimated statistical
significance. Nevertheless, this does not yet falsify the
tidal-torque theory for three reasons: First, we indeed see
indications for potential autocorrelations, though they are not
statistically significant. These indications are consistent with
the theoretically predicted correlation length of 1 Mpc/h.
Improving the data might help to test these indications.
Second, using a KS-test to analyse the angular-momentum-orientation
vectors in the Local Group, the null hypothesis of
random orientation yields a p-value of 64.8%, i.e., it cannot
be rejected. Therefore, there is no evidence that disc alignment
is at work in the Local Group. Third, the tidal-torque
theory predicts the alignment for angular momenta of
dark-matter haloes and not for the disc galaxies residing inside
these haloes. For instance, van den Bosch et al. (2002) find
a median misalignment of angular momenta of disc galaxies
and their host haloes of $\approx 30^\circ$. Furthermore, even minor
mergers can significantly disturb the angular momenta of
disc galaxies by transferring orbital angular momentum (e.g.
Moster et al. 2010). Conversely, we could speculate whether
there is some relaxation process compensating, e.g., for
perturbations by mergers. However, we do not want to push
this discussion too far because we are wary of turning the
tidal-torque theory from an empirical into a "vampirical"
hypothesis where virtually any observational result can be
explained such that an empirical falsification becomes
impossible (Gelman & Weakliem 2009).

We must conclude that with currently available SDSS 
data it is not possible to place decisive constraints on the free 
parameters of theoretical models. We discussed that already
a full-sky survey of SDSS quality might improve the situation
such that these autocorrelations could become statistically
significant. Furthermore, we argued that photometric
redshift estimates of SDSS quality have too large errors to
be useful for this task; instead, spectroscopic redshift estimates
are necessary. Finally, we discussed that a front-edge
classification of disc galaxies might improve the autocorrelation
estimate of angular-momentum orientation, since it
breaks the geometric degeneracy of the galaxy's disc inclination.
However, we find that imaging data allows visual
front-edge classification only for a minute fraction of objects
in the catalogue, whereas automated front-edge classification
is severely hampered by foreground stars and star-forming
regions. Unfortunately, there are no upcoming surveys that
fulfil all these requirements. Consequently, the search for
autocorrelations of angular momenta of disc galaxies may
remain an open issue for the foreseeable future.

We demonstrated that ellipticity estimates based on 
second moments of the galaxies' light distributions are 
strongly biased by the presence of galactic bulges even for 
Scd galaxies. This bias corrupts autocorrelation estimates 
of angular-momentum orientation because it dominates over 
the expected astrophysical signal. For instance, this leads to
an overestimation of the impact of disc alignment in
weak-lensing studies (Blazek et al. 2011).

APPENDIX A: SIMULATING PAIRS OF
ANGULAR-MOMENTUM-ORIENTATION VECTORS

In this appendix, we explain how to simulate pairs of
angular-momentum-orientation vectors which should exhibit
a given correlation.

A1 Uncorrelated, orthonormal orientation vectors

As the orientation vectors indicate directions, the samples
are drawn from the uniform distributions $\varphi \in [0, 2\pi)$ and
$\cos\vartheta \in [-1, 1]$ of the two polar angles $\varphi$ and $\vartheta$. A random
orientation vector is then given by

    $\vec{\ell}_1 = \begin{pmatrix} \cos\varphi\sin\vartheta \\ \sin\varphi\sin\vartheta \\ \cos\vartheta \end{pmatrix}$ .    (A1)

This vector is normalised, i.e., $\vec{\ell}_1 \cdot \vec{\ell}_1 = 1$. Sampling a uniform
angle $\phi \in [0, 2\pi)$, a second random orientation vector is

    $\vec{\ell}_2 = \sin\phi \begin{pmatrix} -\sin\varphi \\ \cos\varphi \\ 0 \end{pmatrix} + \cos\phi \begin{pmatrix} \cos\varphi\cos\vartheta \\ \sin\varphi\cos\vartheta \\ -\sin\vartheta \end{pmatrix}$ .    (A2)

This vector is again normalised, i.e., $\vec{\ell}_2 \cdot \vec{\ell}_2 = 1$, and also
orthogonal to the first, i.e., $\vec{\ell}_1 \cdot \vec{\ell}_2 = 0$.

A2 Pairs of correlated orientation vectors

In the first step, we sample a pair of uncorrelated
angular-momentum-orientation vectors $\vec{\ell}_1$ and $\vec{\ell}_2$ as described in the
previous section. In the second step, we mix these two
uncorrelated vectors such that we obtain two correlated vectors,

    $\vec{L}_a = \cos\alpha\,\vec{\ell}_1 + \sin\alpha\,\vec{\ell}_2$ ,    (A3)

    $\vec{L}'_a = \cos\beta\,\vec{\ell}_1 + \sin\beta\,\vec{\ell}_2$ ,    (A4)

and their counterparts due to the front-edge degeneracy,

    $\vec{L}_b = \cos\alpha\left[\vec{\ell}_1 - 2(\vec{e}_r \cdot \vec{\ell}_1)\vec{e}_r\right] + \sin\alpha\left[\vec{\ell}_2 - 2(\vec{e}_r \cdot \vec{\ell}_2)\vec{e}_r\right]$ ,    (A5)

    $\vec{L}'_b = \cos\beta\left[\vec{\ell}_1 - 2(\vec{e}'_r \cdot \vec{\ell}_1)\vec{e}'_r\right] + \sin\beta\left[\vec{\ell}_2 - 2(\vec{e}'_r \cdot \vec{\ell}_2)\vec{e}'_r\right]$ ,    (A6)

where $\vec{e}_r$ and $\vec{e}'_r$ are unit vectors pointing from the coordinate
origin towards the positions of both galaxies. Due to
the orthonormality of $\vec{\ell}_1$ and $\vec{\ell}_2$, all these vectors are unit
vectors. The two mixing angles $\alpha$ and $\beta$ have to be chosen
such that the desired input correlation

    $\xi_{\rm input} = \frac{1}{4}\left(\langle(\vec{L}_a \cdot \vec{L}'_a)^2\rangle + \langle(\vec{L}_a \cdot \vec{L}'_b)^2\rangle + \langle(\vec{L}_b \cdot \vec{L}'_a)^2\rangle + \langle(\vec{L}_b \cdot \vec{L}'_b)^2\rangle\right) - \frac{1}{3}$    (A7)

is exhibited by the sampled pairs of orientation vectors. This
provides only a single constraint, i.e., we are allowed to freely
choose one mixing angle. For convenience, we choose $\alpha = 0$
such that $\vec{L}_a = \vec{\ell}_1$, which simplifies the calculations. We now
need to compute the four expectation values.
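
The sampling and mixing steps of this appendix can be sketched in code. The sketch below assumes that the second vector is built from the two unit vectors perpendicular to the first one, with coefficients given by the sine and cosine of the third angle (called `psi` in the code), and that the front-edge counterpart of a vector is obtained by flipping its component along the line of sight. All function names are hypothetical.

```python
import numpy as np

def sample_orthonormal_pair(rng):
    """Draw a random unit vector l1 and a random unit vector l2 orthogonal to it."""
    phi = rng.uniform(0.0, 2.0 * np.pi)        # azimuth of l1
    cos_t = rng.uniform(-1.0, 1.0)             # cosine of the polar angle of l1
    sin_t = np.sqrt(1.0 - cos_t**2)
    psi = rng.uniform(0.0, 2.0 * np.pi)        # rotation of l2 around l1
    l1 = np.array([np.cos(phi) * sin_t, np.sin(phi) * sin_t, cos_t])
    e_phi = np.array([-np.sin(phi), np.cos(phi), 0.0])
    e_theta = np.array([np.cos(phi) * cos_t, np.sin(phi) * cos_t, -sin_t])
    l2 = np.sin(psi) * e_phi + np.cos(psi) * e_theta
    return l1, l2

def flip_radial(vec, e_r):
    """Flip the line-of-sight component of vec (front-edge degeneracy)."""
    return vec - 2.0 * np.dot(e_r, vec) * e_r

def mixed_vectors(l1, l2, alpha, beta, e_r, e_r_prime):
    """Correlated vectors and their front-edge counterparts."""
    La = np.cos(alpha) * l1 + np.sin(alpha) * l2
    Lpa = np.cos(beta) * l1 + np.sin(beta) * l2
    Lb = (np.cos(alpha) * flip_radial(l1, e_r)
          + np.sin(alpha) * flip_radial(l2, e_r))
    Lpb = (np.cos(beta) * flip_radial(l1, e_r_prime)
           + np.sin(beta) * flip_radial(l2, e_r_prime))
    return La, Lb, Lpa, Lpb

rng = np.random.default_rng(3)
l1, l2 = sample_orthonormal_pair(rng)
e_r = np.array([0.0, 0.0, 1.0])
e_r_prime = np.array([0.0, np.sin(0.3), np.cos(0.3)])
La, Lb, Lpa, Lpb = mixed_vectors(l1, l2, 0.0, 0.4, e_r, e_r_prime)
print(np.dot(l1, l2), [np.linalg.norm(v) for v in (La, Lb, Lpa, Lpb)])
```

Because flipping the radial component is a reflection, it preserves lengths and scalar products, so all four mixed vectors remain unit vectors.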



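A2.1 Computing the first term

We start by computing $\langle(\vec{L}_a \cdot \vec{L}'_a)^2\rangle$, which is the simplest
term and also presents the basic arithmetic steps. Evidently,

    $\vec{L}_a \cdot \vec{L}'_a = \cos\beta\,\vec{\ell}_1 \cdot \vec{\ell}_1 + \sin\beta\,\vec{\ell}_1 \cdot \vec{\ell}_2$ .    (A8)

Using $\vec{\ell}_1 \cdot \vec{\ell}_1 = 1$ and $\vec{\ell}_1 \cdot \vec{\ell}_2 = 0$, this expression simplifies to

    $\vec{L}_a \cdot \vec{L}'_a = \cos\beta$ .    (A9)

The autocorrelation is then given by

    $\langle(\vec{L}_a \cdot \vec{L}'_a)^2\rangle = \cos^2\beta$ .    (A10)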

A2.2 Computing the other terms

The other three terms in Eq. (A7) are computed in precisely
the same way. We obtain

    $\langle(\vec{L}_a \cdot \vec{L}'_b)^2\rangle = \frac{7}{15}\cos^2\beta + \frac{4}{15}\sin^2\beta$ .    (A11)

As the correlation estimate is invariant under exchanging
the pair, we can directly conclude that

    $\langle(\vec{L}_b \cdot \vec{L}'_a)^2\rangle = \frac{7}{15}\cos^2\beta + \frac{4}{15}\sin^2\beta$ ,    (A12)

as well. The final term is given by




which depends on the angular separation $\vec{e}_r \cdot \vec{e}'_r$ of the galaxy
pair that is simulated. This dependence is inherited from
flipping the radial component of both angular-momentum-orientation
vectors due to an unknown front edge.
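
The four expectation values entering Eq. (A7) can also be estimated numerically, which provides an independent check of the analytic terms. The sketch below is a Monte-Carlo estimate under the assumptions of this appendix, with the first mixing angle set to zero. Random orthonormal pairs are generated by a Gram-Schmidt step on Gaussian vectors, which is a substitute for the angle-based sampling of Sect. A1; function names and the sample size are hypothetical.

```python
import numpy as np

def random_frames(n, rng):
    """n random orthonormal pairs (l1, l2), each an array of shape (n, 3)."""
    a = rng.normal(size=(n, 3))
    b = rng.normal(size=(n, 3))
    l1 = a / np.linalg.norm(a, axis=1, keepdims=True)
    b -= np.sum(b * l1, axis=1, keepdims=True) * l1    # Gram-Schmidt step
    l2 = b / np.linalg.norm(b, axis=1, keepdims=True)
    return l1, l2

def flip(v, e):
    """Flip the component of every row of v along the unit vector e."""
    return v - 2.0 * (v @ e)[:, None] * e

def input_correlation(beta, e_r, e_r_prime, n=200000, seed=0):
    """Monte-Carlo estimate of Eq. (A7) and of its four terms for alpha = 0."""
    rng = np.random.default_rng(seed)
    l1, l2 = random_frames(n, rng)
    La = l1
    Lb = flip(l1, e_r)
    Lpa = np.cos(beta) * l1 + np.sin(beta) * l2
    Lpb = (np.cos(beta) * flip(l1, e_r_prime)
           + np.sin(beta) * flip(l2, e_r_prime))
    terms = [np.mean(np.sum(u * v, axis=1) ** 2)
             for u, v in ((La, Lpa), (La, Lpb), (Lb, Lpa), (Lb, Lpb))]
    return 0.25 * sum(terms) - 1.0 / 3.0, terms

e_r = np.array([0.0, 0.0, 1.0])
e_r_prime = np.array([0.0, np.sin(0.3), np.cos(0.3)])
xi_input, terms = input_correlation(0.5, e_r, e_r_prime)
print(xi_input, terms)
```

Evaluating `input_correlation` on a grid of mixing angles and interpolating offers a simple numerical way to find the angle that reproduces a desired input correlation for a given pair separation.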

A2.3 Mixing angle 

Inserting all four terms into Eq. (A7), we can solve for the 
mixing angle for a given input correlation. The result is 



This mixing angle is used in Sect. 6.3. 

This paper has been typeset from a TeX/LaTeX file prepared
by the author.




(A14) 



